CA India Logo

CA V MANMOHAN

Back to Guides & Tools
Tech & AIMar 17, 2025

SQL Simplified: The Professional Handbook

📘 SQL Simplified

The 80/20 rule to mastering SQL. Learn the most important concepts quickly while building a complete professional foundation.

Preface

This SQL Handbook follows the 80/20 rule—helping you learn the most important SQL concepts quickly while building a complete professional foundation.

By the end of this guide, you will understand:

  • Internals: How databases store and retrieve data physically and logically.
  • Design: How to structure relationships (Normalization) and ensure integrity.
  • Querying: How to filter, join, and aggregate massive datasets.
  • Optimization: How to use Indexes and execution plans for speed.

1 Information Systems, Data, and Information

An Information System is an ecosystem of hardware, software, data, and processes used to collect, process, and distribute data.

Key Concepts & The DIKW Hierarchy

Wisdom
Knowledge
Information
Data

Fig 1. The DIKW Pyramid representing data transformation.

  • Data (The "What"): Raw, unprocessed facts. It has no context.
    Examples: 50, ABCD, 2026-01-01.
  • Information (The "So What"): Processed data that has context and meaning.
    Context: "50" is the price, "ABCD" is the product code.
  • Knowledge (The "How"): Information combined with patterns, experience, or rules.
    Insight: Sales of Product ABCD drop when the price hits 50.
  • Wisdom (The "Why"): Applied knowledge to make strategic decisions.

Example Transition

LevelContentContext/Meaning
Raw Data50, 2026-01-01Meaningless numbers/dates on their own.
Information50 Units Sold on 2026-01-01Answering: "How much did we sell today?"

Exercise

Task: Convert the following raw data into information based on a retail context: 5000, 2025-12-31

2 What is a Database and DBMS?

Database (The Vault)

A Database is an organized collection of structured data (usually stored on a hard drive) designed for rapid search and retrieval.

DBMS (The Librarian)

The Database Management System (DBMS) is the software layer that sits between the user and the raw data files. It manages the physical details of storage so you don't have to.

User / App

DBMS
(Software Layer)

Database (Disk)

Responsibilities of a DBMS

  • Data Storage & Retrieval: Efficiently writing to and reading from disk.
  • Concurrency Control: Ensuring two users don't overwrite the same record simultaneously (Locking).
  • Security: Managing users, roles, and access permissions.
  • Backup & Recovery: Logging transactions to restore data after a crash (Write-Ahead Logging).
  • Integrity Enforcement: Ensuring data follows defined rules (e.g., age cannot be negative).

Types of DBMS

  • Relational (RDBMS): Stores data in tables with rigid schemas. (MySQL, PostgreSQL, Oracle). Best for structured business data.
  • NoSQL: Flexible schemas for unstructured data. (MongoDB - Document, Redis - Key/Value). Best for rapid prototyping or caching.
  • NewSQL: Combines ACID guarantees of RDBMS with the scalability of NoSQL. (CockroachDB).

Exercise

Task: List two critical responsibilities of a DBMS.

3 When Do You Need a Database?

Deciding between a Spreadsheet (Excel) and a Database is a critical architectural choice.

Use a Database When:

  • Data Volume: You have millions of rows (Excel slows down after ~100k).
  • Multi-user Access: Multiple people need to write data at the exact same time.
  • Complexity: Data has strict relationships (e.g., A Customer must exist before an Order is created).
  • Integrity: You need to enforce data types (prevent typing "Twenty" in a "Price" column).

Avoid a Database When:

  • The dataset is tiny and static.
  • No relationships exist (flat list).
  • You need complex, free-form visualization instantly (Excel is better for quick graphing).

Exercise

Task: Should a real-time chat application use a DBMS?

4 Relational Databases

Relational databases represent data as a collection of two-dimensional tables (relations) linked by keys.

Components

  • Table (Relation): A grid representing a specific entity (e.g., Customers).
  • Row (Tuple): A single instance of that entity (e.g., Customer John Doe).
  • Column (Attribute): A specific property of the entity (e.g., Email Address).
  • Primary Key (PK): A column that uniquely identifies a row. Cannot be NULL.
  • Foreign Key (FK): A column that points to the Primary Key of another table, establishing a link.

Customers Table

Cust_ID (PK)NameEmail
101John Doejohn@example.com
102Jane Smithjane@example.com

Orders Table

Order_ID (PK)ProductCust_ID (FK)
A1Laptop101
A2Mouse101

Relationships

  • One-to-One: One user has one Passport.
  • One-to-Many: One Author has written many Books. (Most common).
  • Many-to-Many: Students and Classes. A student takes many classes; a class has many students. (Requires a "Junction Table" to resolve).

Exercise

Task: What is the specific purpose of a Primary Key?

5 Introduction to SQL

SQL (Structured Query Language) is the standard language for RDBMS.

Declarative Nature

SQL is declarative, meaning you tell the database what result you want (e.g., "Give me all active users"), and the DBMS Query Optimizer decides how to find it (indexes, scan order, etc.).

SQL Dialects

While there is an ANSI SQL standard, every vendor adds "flavor":

  • MySQL/MariaDB: Web standard, open source.
  • PostgreSQL: Highly advanced, strictly standards-compliant.
  • SQL Server (T-SQL): Microsoft ecosystem, procedural extensions.
  • Oracle (PL/SQL): Enterprise scale, complex procedural logic.

Exercise

Task: Write a query to get all rows from the books table.
SELECT * FROM books;

6 Types of SQL Statements (DDL vs DML)

SQL commands are grouped by their function.

DDL (Data Definition Language)

Defines the structure (schema) of the database.

  • CREATE: Builds new objects.
  • ALTER: Modifies existing structure.
  • DROP: Deletes the object entirely.
  • TRUNCATE: Removes all data but keeps the structure.

DML (Data Manipulation Language)

Manage the data within the structure.

  • SELECT: Reads data.
  • INSERT: Adds data.
  • UPDATE: Modifies data.
  • DELETE: Removes data.

Other Categories

  • DCL (Data Control): GRANT, REVOKE (Permissions).
  • TCL (Transaction Control): COMMIT, ROLLBACK (Saving/Undoing changes).

Exercise

Task: Classify the statement ALTER TABLE.

7 Creating Tables

The CREATE TABLE statement is the blueprint for your data.

Common Data Types

  • INT: Whole numbers.
  • DECIMAL(p,s): Exact currency.
  • VARCHAR(n): Variable-length text.
  • DATE / TIMESTAMP: Chronological data.
  • BOOLEAN: True/False.

Constraints

  • PRIMARY KEY: Unique + Not Null.
  • NOT NULL: Field must have a value.
  • UNIQUE: No duplicates allowed.
  • CHECK: Validates data.
  • DEFAULT: Fallback value.

Exercise

Task: Create an orders table with an ID, date, and amount.

8 Altering and Removing Tables

ALTER Capabilities

  • ADD column_name datatype: Extensions.
  • DROP COLUMN column_name: Cleanup.
  • MODIFY/ALTER COLUMN: Changing a data type.

DELETE vs. TRUNCATE vs. DROP

  • DELETE: Row-by-row removal. Slow. Can be rolled back.
  • TRUNCATE: De-allocates data pages. Instant. Cannot be rolled back.
  • DROP: Destroys the table definition and data.

Exercise

Task: Remove the column phone from table people.

9 Inserting Data

Syntax Variants

INSERT INTO users (name, id) VALUES ('John', 1);
INSERT INTO users (id, name) VALUES (1, 'A'), (2, 'B');

Exercise

Task: Insert 3 rows into users.

10 SELECT Statement

Capabilities

  • Projection: Choosing specific columns (SELECT name).
  • Selection: Choosing specific rows (WHERE id = 1).
  • Aliasing: Renaming columns (AS).

Exercise

Task: Select the name column and convert it to uppercase.

11 COUNT, DISTINCT, LIMIT

Aggregation & De-duplication

  • COUNT(*): Counts all rows.
  • DISTINCT: Removes duplicate rows.

Exercise

Task: Get the 2nd page of results, assuming 10 rows per page.

12 Filtering with WHERE

Operators

  • =, <>, <, >
  • AND, OR, NOT

Exercise

Task: Find names not equal to 'John'.

13 String Patterns and Ranges

Pattern Matching (LIKE)

  • %: Matches any sequence.
  • _: Matches exactly one character.

Ranges (BETWEEN)

BETWEEN x AND y: Inclusive.

Exercise

Task: Find users with ages 20 to 30.

14 Sorting Results

  • ASC: Ascending (Default).
  • DESC: Descending.

Exercise

Task: Sort by city (A-Z), then by age (Oldest to Youngest).

15 Grouping Results

GROUP BY collapses multiple rows into a single row based on shared values.

The Golden Rule

If you use GROUP BY, every column in the SELECT clause must either be:

  1. Part of the GROUP BY clause.
  2. Inside an aggregate function (SUM, AVG, COUNT).

Exercise

Task: Calculate total sales amount per month.

16 HAVING

WHERE: Filters rows before grouping.

HAVING: Filters groups after aggregation.

Exercise

Task: Find cities where the average salary is greater than 5000.

17 Getting Table and Column Details

A standardized set of read-only views containing data about the database itself.

Exercise

Task: List all tables in the current database.

18 Handling NULL Values

NULL means "Unknown". It is not zero, and it is not an empty string.

Functions

  • COALESCE(col, val): Returns the first non-null value.
  • IFNULL(col, val): MySQL specific.

Exercise

Task: Select salary, replacing NULLs with 0 for calculation safety.

19 Keys and Constraints

Constraints enforce Data Integrity.

  • Primary Key: The ID card of the row.
  • Foreign Key: The link to another table. Enforces Referential Integrity.

Exercise

Task: Why use constraints?

20 Updating Data

Safety First

Always write your WHERE clause before the SET clause.

Exercise

Task: Set user status to 'inactive' if they haven't logged in for a year.

21 Deleting Data

DELETE vs TRUNCATE

  • DELETE: Can use WHERE, slower.
  • TRUNCATE: Removes ALL rows, faster.

Exercise

Task: Delete all logs (speed is priority, no rollback needed).

22 Multiple Tables (Normalization)

  • 1NF: Cell values are atomic.
  • 2NF: All columns rely on the whole Primary Key.
  • 3NF: No transitive dependencies.

Exercise

Task: What is the main goal of normalization?

23 Joins Overview

Join Types

INNER JOIN
LEFT JOIN
RIGHT JOIN
FULL OUTER

*Illustrative

  • INNER JOIN: Returns match in both tables.
  • LEFT JOIN: All from Left, matches from Right.
  • RIGHT JOIN: Opposite of Left.
  • FULL OUTER JOIN: All from both.

Exercise

Task: Find users who have registered but never placed an order.

24 Built-in Functions

Exercise

Task: Round salary to 2 decimal places.

25 Date and Time Functions

  • NOW(): Current date + time.
  • DATEDIFF(): Days between two dates.

Exercise

Task: Find orders placed in the last 7 days.

26 Subqueries

A query inside a query.

Exercise

Task: Find users who have at least one order (using EXISTS).

27 Views

Exercise

Task: Create a view high_earners for employees earning > 10000.

28 Stored Procedures

Exercise

Task: Create a procedure to get user details by ID.

29 Transactions & ACID

ACID Properties

Atomicity: All or nothing.
Consistency: Valid states only.
Isolation: No interference.
Durability: Permanent save.

Exercise

Task: Which isolation level prevents reading uncommitted data (Dirty Reads)?

30 Indexes

An Index is a data structure (usually B-Tree) that improves speed.

Analogy

Like a book's index. Instead of reading every page (Full Table Scan), look up the page number.

Exercise

Task: Create a composite index on city and age.
🎯

Final Note

You now possess a 360° professional reference for SQL.

Share this Article

Value Your Opinion

Insightful discussions lead to better solutions. I'd love to hear your thoughts, questions, or feedback on this article.

Share Your Thoughts