Back to Blog
7 min. read

SQL Basics – Billennium

SQL (Structured Query Language) is a worldwide standard for defining, operating, and managing data in relational databases. It is used in products of most of the companies that produce database management software. There are several SQL dialects (the focus of this article will be SQL syntax in Microsoft technologies).

SQL is a standard tool for accessing data in various environments with diverse hardware and operating systems. SQL provides:

Data search in databases;

  • DML (Data Modification Language) – inserting, modifying, and deleting data from the databases;
  • DDL (Data Definition Language) – adding new tables to the databases;
  • DCL (Data Control Language) – defense against unauthorized access.

The queries (select) allow for the selection of specified information (columns) from the tables (table sets). The fundamental structure of a SQL query is constructed with 3 keywords enlisted below (in this exact order):


SELECT
– a list of columns or expressions we want to display

FROM – the source from which the displayed data comes from

WHERE – the conditions that filter the displayed data.

By adding up the clauses listed above, we get the full query syntax used to obtain the desired data:

SELECT

WHERE

GROUP BY

HAVING

ORDER BY

This article will consider the possibilities every keyword gives us when using them in SQL queries.

SQL Basics – SELECT and FROM clauses

SELECT and FROM clauses are the fundamentals of every query performed in the database. The SELECT clause is used mainly to choose the columns displayed in the query result. Each time specific columns are selected to be displayed, the source (a table or a set of tables) that contains the required columns must be defined. Therefore, applying the SELECT and FROM clauses in a single query is vital. Nevertheless, it should be noted that each clause is responsible for a different operation.

The SELECT clause, as we have already mentioned, is a keyword that enables the display of column names. However, it is possible to display more than just columns.The clause is responsible for returning objects in general – which means that within the SELECT clause, the following actions can be performed:

  • SELECT * – this expression displays all data from the source defined in the FROM clause.
  • TOP [number] – this expression, used with the SELECT clause, displays a specific number of the records (rows) for a given query.
  • DISTINCT – the expression that displays unique records (rows) within the query. The word is placed just after SELECT.
  • Merging columns – displaying common information for separate columns in the SELECT syntax is possible. In the presented example (Microsoft tool), the merger is performed by using ‘+’:
  • Aliases – an alias changes a column name in a displayed query result. This can be performed as shown in the following expression [column_name] as [new_column_name]:
  • The operation performed on the columns:
  • Aggregating functions – displaying certain aggregated (grouped) data may be necessary after SELECT. To perform such an operation, the following aggregating functions are used (when grouping, the GROUP BY clause needs to be applied as well):
  1. COUNT() – the function used to count the number of rows;
  2. SUM() – the function that sums up all values from a column;
  3. AVG() – this function calculates the median value of all the values from a column;
  4. MIN(), MAX() – these two functions return the smallest and the largest value for an argument, respectively.
  • Conditional expressions – certain conditional expressions can be used within the SELECT clause. They will return data that meet conditions specified in the given expression. For this article, let us study the CASE function.

CASE is a conditional expression used not only in the SELECT clause but in the case of ETL tests it is most likely to be found within the SELECT clause. The CASE syntax is as follows:

CASE WHEN [condition] THEN [value] ELSE [other value] END

The number of WHENTHEN conditions in the syntax may be any number – it only depends on how many different cases are analyzed.

  • Subqueries – subqueries will be described in more detail in the section dedicated to the FROM clause.
  • Text unrelated to the source – In the SELECT clause, displaying an ordinary text value that does not exist in the source is also possible.

The FROM clause is placed inside the SQL query syntax. This clause points to the source from which data defined in the SELECT clause is derived. Here are the examples of the operations that can be executed within the FROM clause:

  • FROM [table name] – this form is the most basic when defining data source, as it does not interfere with the source structure.
  • It is common to join the tables to retrieve data that is not stored in a single table. To this end, the JOIN syntax is used, which allows us to join multiple tables related by keys. JOIN syntax is:

TableA JOIN TableB ON tablea_key = tableb_key

Here are the types of joins that vary in terms of results one wants to obtain – the diagram represents different joins on the example of sets:

  • Subqueries – a subquery is, to put it in simple terms, a SELECT instruction embedded in a different SQL instruction, which provides data for the external structure. Subqueries can also be used in the SELECT and WHERE clauses. They can be settled one another multiple times – SQL does not limit the number of embeddings being performed. Here is an example of a subquery in the FROM clause:

SQL Basics – WHERE clause

The WHERE clause is applied if needed to build a query that returns data that meets specific criteria. The WHERE clause is a crucial part of the SELECT syntax as it is part of the query where we define the criteria influencing the query results. In the WHERE clause, a rule is formulated in relational algebra that refers to the selected conditions and specifies the limitations that need to be followed by the records to be chosen in a given query. If the record follows the limitations, it is placed in the result table.

Defining the conditions – the conditional expression is placed after the WHERE keyword. The expression may contain constants or table names included in the FROM clause.

  • The following six operators are used for comparing values in SQL:

– Equal to: =

– Not equal: to <> or !=

– Less than: < – Greater than: >

– Less than or equal to: <= – Greater than or equal to: >=

  • It is allowed to build arithmetic expressions for numeric values using +, -, *, /, and the brackets ().
  • Text comparison – two texts are equal if they contain the same symbols placed in the same order. With “inequality” tests, i.e., when making comparisons like < or >=, the outcome of the comparison is decided upon whether the symbols in the text on the left are alphabetically before those from the text on the right side of the expression.
  • Operations carried out in the WHERE clause fall under the rules of Boolean logic – the result is either TRUE or FALSE. When the result of the expression is TRUE, a row is selected. Otherwise, it is omitted. Here are the rules of how the conditions can be combined:

AND operator returns TRUE value when expressions on both sides of the operator are matching – should even one of them not match, the whole expression is returned as FALSE

OR operator returns TRUE value when one of the expressions on the right or the left side of the operator is valid – when both expressions are valid, the result also will be TRUE

NOT operator is used to contradict the values in the expression

Multiple logical operators can create complex WHERE instructions where several expressions are used. It must be remembered that when building such an expression, it is crucial to keep the expressions to perform calculations correctly. The Priority of the expressions used in expressions is as follows:

comparison operators, logical operators

  • Conditions for NULL value – NULL values do not undergo any comparison operations since they are treated as unknown values. SQL allows testing fields in search for NULL values:

– Using the IS NULL expression in the WHERE clause is performed to check if the fields contain a NULL value. If IS appears in place of a standard comparison operator, NULL is not taken into inverted commas.

– Data can be searched for objects that contain values. In this situation, the IS NOT NULL expression is used.

  • IN – the more complex the query is, the more difficult it is to keep the operations in order. In such instances, using brackets, which in the WHERE clause help arrange expressions, becomes mandatory. SQL provides elements that considerably simplify the structure of a query with multiple logical operators. The IN clause works as multiple OR operators in validation rules that verify if a particular group of values can be found in a given column. The IN operator defines whether the test value is identical to at least one value from the list. The IN syntax is as follows:

IN (value 1, value 2, …, value n)

  • The NOT operator can contradict the logical value of an expression inside the IN clause. The IN clause selects all the rows where the tested value equals one of the values placed on the list. NOT IN picks the rows where the tested value is different from any value from the list;
  • BETWEEN – this clause and its contradiction – NOT BETWEEN – is used to check if the values belong (or not) to a given range and whether they are placed within boundary values.
  • LIKE – this operator allows working on the columns that contain string data type. The LIKE operator verifies whether the text value corresponds with a given example, so it enables performing partial comparisons like: “beginning with a text,” “ending with a text,” or “including a text.” When creating an example of a search item, wildcard characters are used:

% replaces a sequence of signs of any length (the length can be 0)

_ corresponds to a single sign in a searched text.

The WHERE clause with the LIKE operator looks the following way:

WHERE text LIKE example of a searched item

The above description defines the construction of the conditions (their logic). It should be mentioned that several system functions enable accurate forming of the conditions: these functions directly impact the records, not the logic.

SQL Basics – GROUP BY and HAVING clauses

GROUP BY and HAVING clauses are connected with aggregating functions, which were mentioned in the context of the SELECT clause.

SQL provides several aggregating functions that operate on a group of values a query returns and not on a single value. These functions allow us, for instance, to count the number of rows that meet specified criteria or calculate the meaning of all the values from a column. They may work on all column rows, a restricted group of rows selected by the WHERE clause, or groups of data defined by the GROUP BY clause.

  • COUNT(column_name) – this function counts the number of rows in a query. Should the number of rows be known, the simplest solution would be to use the COUNT(*) function (where the asterisk represents choosing all the table columns). COUNT(column_name) does not include fields with NULL values. The COUNT(*) function totals all rows regardless of their content. The fact that NULL values are not included in the calculation may be helpful when the NULL value has a crucial meaning, e.g., no defined salary.
  • SUM(column_name) and AVG(column_name) – the SUM() function sums up all values and returns a single result. The AVG() function calculates the mean from a group of values. Unlike the COUNT() function, which works with all types of data, the SUM() and AVG() functions work only for numeric arguments. The calculation ignores the NULL values in the case of SUM() and AVG() functions. Both functions can be used with the DISTINCT operator – if applied, the calculation is carried out only for unique values.
  • MIN(column_name) and MAX(column_name) – these functions search for the smallest and largest value in a set, respectively. Both functions can be used for different types of data. The MAX() function finds the largest data string (according to the string comparison rules), the most recent date (or the most distant date in the future), or the largest number in a juxtaposition. The MIN() function, on the other hand, finds the smallest values. The NULL value is treated as unspecified, so it cannot be compared with different values (the NULL values are ignored). The MAX and MIN functions can be used with the DISTINCT operator, but it makes no significant difference, as they all return only a single value from a juxtaposition.

The GROUP BY clause automatically divides query results into chosen categories. It allows us to group results by the contents of a selected table. If an aggregating function is used for more than one column, the function will only work with the criteria defined in the GROUP BY clause. It is essential to declare the column by which the query result is categorized in the SELECT expression.

The HAVING clause enables filtering (applying conditions on) grouped results obtained by the GROUP BY clause. The HAVING clause differs from the WHERE clause as WHERE filters query results before being grouped. In other words, WHERE is a condition for single records, while HAVING filters the results after the grouping is done. The expressions included in this clause are executed on whole groups, not single records. Including the expression from the SELECT clause in the HAVING clause is crucial.

HAVING and WHERE can be used in the same query.

SQL Basics – ORDER BY clause

The ORDER BY clause is used to sort query results. The results will be ordered by the contents of the column (or columns) defined in the ORDER BY clause. The results can be sorted alphabetically, by number or by date. The order of the displayed columns is not dependent on a column or the columns by which the ordering of query results was performed, as the columns keep their places no matter what column name is used in the ORDER BY clause.

Query results can be ordered in ascending (default option) or descending order. To sort the results in descending order, the DESC keyword has to be used in the ORDER BY clause. Sorting the result in ascending order applies to the ASC keyword, which is usually omitted.


Order of execution of a query:

The order of execution is strictly defined. Understanding the query order can help you diagnose why a given query doesn’t run properly and, even more frequently, will help you optimize your queries to run faster.

SQL Basics – Summary

This article has covered the fundamental concepts of SQL, providing a solid foundation for anyone looking to leverage data within relational databases. Understanding how to construct queries using key clauses like SELECT, FROM, WHERE, GROUP BY, HAVING, and ORDER BY opens up a world of possibilities for data analysis, reporting, and decision-making. SQL is not just a language – it’s a powerful tool used in various fields, from business intelligence and analytics to application development and database administration.

If you are interested in applying these skills in a professional environment, read more about the experience of our Process Management & QA team. Our team specializes in optimizing processes and ensuring quality through effective data management strategies. Dive into the world of SQL with us and see how it can transform your approach to data!