Best Way to Learn SQL for Data Scientist : Part 1

Best way to learn SQL featured image

SQL is one of the most essential skill for any Data scientist . Basic Data Mining , Uncovering the hidden parameters  and much more daily routine task for every data scientist are easily possible with basic SQL knowledge . If you are a beginner in Data science and looking for the best way to learn SQL for Data science then it is going to be the best place for you . So enjoy the party here , I mean go through the complete article . I promise it will not take much time .

Best way to learn SQL for Data Scientist-

We have designed the complete learning into various learning steps

  1. Create the schema ( DDL)
  2. Insert /update the data into Data bases
  3. Complex queries and result projection

We will try to end up only fist and second step in this article . After it complex queries and result projection  part will be done in next article .Before I go forward I will recommend you to go through the article Know why and how using SQL is Important in Data science

1.Create the schema ( DDL)

Make an assumption,  you are doing any Data science project in which You are using relational database .Please keep in the mind that there could be two situations-

  1. You don’t have any data and you are getting the data from external source . Now you need to dump into some SQL based database .Here you need this Create the schema section .
  2. In the second situation , You have data in existing database . Only thing you have to do is run some query . In this case this section will no longer in use . You may directly jump into complex queries and result projection section .

When you don’ t have the existing database , You need to create new one .

SQL Query to create Database :

CREATE DATABASE testDB ;               //Here testDB is database name

SQL Query to create Table :

CREATE TABLE test_table (

Col1  Datatype  Constraint ,

Col2  Datatype Constraint ,

Col3  Datatype Constraint ,

Col4  Datatype Constraint

);

Here Data types are varchar, int etc .

Other Required DDL command –

There are few other important DDL Commands like-

1.ALTER

2. DROP

3.TRUNCATE

What  are  Constraint in SQL –

When you create a table , There will be an option for you to add an constraint or not . Its completely situation oriented to add a constraint  into DDL script .  NOT NULL , UNIQUE , PRIMARY KEY , FORIEGN KEY , CHECK , DEFAULT , INDEX  are possible constraint  in SQL .

2.Insert /update the data into Data bases-

After successfully creation of schema . Its time to load  data into it  .This is one of the most easy task in SQL . There is a straight way command for this –

INSERT INTO table_name ( Column 1 , column 2  ,…..etc) VALUES (Val 1 , Val 2 ,……etc );

In case you want to insert values in every column of the Table , You may use-

INSERT INTO table_name VALUES (Val 1 , Val 2 ,……etc );

Now after insertion , If user / developer need to update any value in the Table , Here is the command for that –

UPDATE Table_name

SET Column_name = Value 

Where condition ;

3.Complex queries and result projection-

In the series of  best way to learn SQL , We have reached at the place where we know how to create Database , Table and how to insert data into the Table . Its time to taste the real water . Basically above part of the post was used to give a preview of SQL prerequisites for Data scientist / Data Analyst .In this Post I will introduce you with most used query in SQL –

SELECT –

We use this command when we want to see the Data from the Table .There is some variation of this command –

  1. For selecting entire table –

SELECT *FROM table_name;

       2.  For selecting few column from entire table –

SELECT column_name FROM table_name;

       3. From Projection of Distinct values –

SELECT DISTINCT column_name FROM table_name;

ORDER BY –

This statement is useful in setting order for result projection . I mean ascending and descending sorting in columns . You have to mention it and end part of select statement . I think syntax will help you understand better –

SELECT * FROM table_name WHERE  condition ORDER BY column_name;

Here you can mention more than one column after ORDER BY statement . You can also add ASC or DESC after column name to make sorting more specific .

SELECT * FROM table_name WHERE  condition ORDER BY column_name1  ASC ,column_name2  DESC ;

GROUP BY with HAVING clause –

I don’t know why but it is mostly asked SQL queries in interviews. In SQL , There are some aggregate function exist like AVG(Column_name),  SUM(Column_name),  COUNT(Column_name) etc . Here you can group these result by specific column . Here I will give you a complete syntax for GROUP BY query integrated with HAVING query command .

SELECT AVG(column_name 1), column_name 2
FROM table_name
WHERE condition
GROUP BY column_name 2
HAVING condition
ORDER BY column_name 2;

Suppose there is a table of Reservation in Airlines where passengers are from different countries . Now you want to project no of passenger from every country on a particular journey date . Here you can achieve tour target by using GROUP BY command –

SELECT COUNT (passenge_id), country
FROM reservation
WHERE Journydate=’dd-mm-yyyy’
GROUP BY country;

If you want to make it more specific . Lets add some more complexity on it . If you need to SELECT only those country where no of passenger is greater than 1000 .Here is the query for this –

SELECT COUNT (passenge_id), country
FROM reservation
WHERE Journydate=’dd-mm-yyyy’
GROUP BY country

HAVING COUNT (passenge_id)>1000;

END NOTES-

So here I  end with essential for SQL in Data science . I have mentioned  it as best way to learn SQL for Data science because this article can polish your SQL skills in five minutes  . This will give 80% results in 20% efforts . If you need more assistance in understanding the requirement for SQL in Data science , Please go through the article Using SQL for Data Science : Know Why and How ?

Do not forget to make  connection with us by just simple subscription in Data science Learner Newsletter .  We help you to update with all recent happening in Data science world . You can also write your feed back to us about this article . We love to read the responses from reader side .

Data Science Learner Team.

Join our list

Subscribe to our mailing list and get interesting stuff and updates to your email inbox.

Thank you for signup. A Confirmation Email has been sent to your Email Address.

Something went wrong.

Meet Abhishek ( Chief Editor) , a data scientist with major expertise in NLP and Text Analytics. He has worked on various projects involving text data and have been able to achieve great results. He is currently manages Datasciencelearner.com, where he and his team share knowledge and help others learn more about data science.
 
Thank you For sharing.We appreciate your support. Don't Forget to LIKE and FOLLOW our SITE to keep UPDATED with Data Science Learner