13.2. SQL Query Primer#
13.2.1. Query Clauses#
SQL offers six clauses (Query Clauses) to query your data. All SQL queries will use at least two clauses, i.e., SELECT
and FROM
.
Clause Name |
Purpose |
---|---|
SELECT |
Determines which columns to include in the query’s result set |
FROM |
Identifies the tables from which to retrieve data and how the tables should be joined |
WHERE |
Filters out unwanted data |
GROUP BY |
Used to group rows together by common column values |
HAVING |
Filters out unwanted groups |
ORDER BY |
Sorts the rows of the final result set by one or more columns |
13.2.1.1. Basic query#
The most basic SQL query will have a SELECT
and FROM
clause. Select lets you choose the columns
you want. In case you want all the columns, you can use *
, which indicates to SQL you want all the columns.
The FROM
clause lets you specify the table you want to query. The following is the most basic SQL query:
You can select all the columns by using *
after the SELECT
. Note all SQL queries are terminated
by a semicolon (;
). You can format your SQL with as many spaces and tabs as you like. To indicate to SQL
that your query statement is complete, terminate it with a semicolon.
Query: Select all columns and all rows from the PatientCorePopulatedTable
table.
SELECT
*
FROM
PatientCorePopulatedTable;
By default all the rows are selected since no filtering is applied.
Show code cell source
import sqlite3
import pandas as pd
conn = sqlite3.connect('100_patients.db')
sql_statement = """
SELECT
*
FROM
PatientCorePopulatedTable;
"""
df = pd.read_sql_query(sql_statement, conn)
df.style.set_table_attributes('style="font-size: 12px"')
PatientID | PatientGender | PatientDateOfBirth | PatientRace | PatientMaritalStatus | PatientLanguage | PatientPopulationPercentageBelowPoverty | |
---|---|---|---|---|---|---|---|
0 | FB2ABB23-C9D0-4D09-8464-49BF0B982F0F | Male | 1947-12-28 02:45:40.547 | Unknown | Married | Icelandic | 18.080000 |
1 | 64182B95-EB72-4E2B-BE77-8050B71498CE | Male | 1952-01-18 19:51:12.917 | African American | Separated | English | 13.030000 |
2 | DB22A4D9-7E4D-485C-916A-9CD1386507FB | Female | 1970-07-25 13:04:20.717 | Asian | Married | English | 6.670000 |
3 | 6E70D84D-C75F-477C-BC37-9177C3698C66 | Male | 1979-01-04 05:45:29.580 | White | Married | English | 16.090000 |
4 | C8556CC0-32FC-4CA5-A8CD-9CCF38816167 | Female | 1921-04-11 11:39:49.197 | White | Married | English | 18.200000 |
5 | 7FD13988-E58A-4A5C-8680-89AC200950FA | Male | 1965-07-12 15:41:20.523 | White | Married | Spanish | 12.410000 |
6 | C60FE675-CA52-4C55-A233-F4B27E94987F | Male | 1957-10-30 23:26:15.303 | Asian | Married | Spanish | 12.800000 |
7 | B39DC5AC-E003-4E6A-91B6-FC07625A1285 | Female | 1935-11-03 21:07:09.040 | White | Married | English | 15.310000 |
8 | FA157FA5-F488-4884-BF87-E144630D595C | Female | 1932-11-01 06:19:56.577 | White | Single | English | 16.320000 |
9 | B7E9FC4C-5182-4A34-954E-CEF5FC07E96D | Female | 1985-12-11 02:48:16.907 | Unknown | Single | English | 11.430000 |
10 | 1A40AF35-C6D4-4D46-B475-A15D84E8A9D5 | Male | 1949-01-10 15:37:35.543 | White | Married | English | 11.250000 |
11 | DA6CECFF-DE13-4C4C-919F-64E1A2B76C9D | Male | 1924-06-27 19:37:58.823 | Asian | Divorced | English | 83.750000 |
12 | 0A9BA3E4-CF3C-49C4-9774-5EEA2EE7D123 | Male | 1980-04-04 07:59:15.820 | White | Single | Spanish | 14.280000 |
13 | 7C788499-7798-484B-A027-9FCDC4C0DADB | Male | 1926-08-13 10:22:16.247 | White | Married | English | 11.890000 |
14 | C54B5AAD-98E8-472D-BAA0-638D9F3BD024 | Female | 1972-11-13 22:50:43.600 | African American | Single | Unknown | 94.000000 |
15 | 6985D824-3269-4D12-A9DD-B932D640E26E | Female | 1967-10-27 03:58:29.027 | White | Married | English | 12.010000 |
16 | D8B53AA2-7953-4477-9EA4-68400EBAAC5C | Male | 1951-06-11 20:11:10.003 | African American | Unknown | English | 16.890000 |
17 | DB92CDC6-FA9B-4492-BC2C-0C588AD78956 | Male | 1977-06-30 01:46:05.467 | Unknown | Single | Icelandic | 1.700000 |
18 | 25B786AF-0F99-478C-9CFA-0EA607E45834 | Male | 1926-08-20 00:21:38.870 | White | Married | English | 93.600000 |
19 | F00C64F8-2033-4640-80FE-F1F62CBE26A5 | Female | 1957-06-04 18:31:01.177 | African American | Single | English | 15.720000 |
20 | B70E5A76-F2BC-41E4-B037-CD4D9ABA0967 | Male | 1962-12-15 01:59:01.737 | White | Single | Icelandic | 13.100000 |
21 | F0B53A2C-98CA-415D-B928-E3FD0E52B22A | Male | 1950-06-20 10:31:18.337 | African American | Married | English | 14.870000 |
22 | 1A8791E3-A61C-455A-8DEE-763EB90C9B2C | Male | 1973-08-16 10:58:34.413 | Asian | Single | English | 13.970000 |
23 | 81C5B13B-F6B2-4E57-9593-6E7E4C13B2CE | Female | 1930-05-28 02:59:42.857 | White | Married | Icelandic | 19.220000 |
24 | 220C8D43-1322-4A9D-B890-D426942A3649 | Male | 1971-03-13 18:31:24.877 | Asian | Married | English | 16.710000 |
25 | C242E3A4-E785-4DF1-A0E4-3B568DC88F2E | Male | 1963-02-23 21:46:28.840 | Asian | Single | Spanish | 7.590000 |
26 | 0BC491C5-5A45-4067-BD11-A78BEA00D3BE | Female | 1921-04-18 01:56:01.807 | Unknown | Married | English | 18.050000 |
27 | 79A7BA2A-D35A-4CB8-A835-6BAA13B0058C | Female | 1952-05-03 02:30:21.563 | White | Single | Spanish | 16.240000 |
28 | FFCDECD6-4048-4DCB-B910-1218160005B3 | Male | 1941-05-06 14:56:42.687 | White | Single | English | 14.490000 |
29 | CD2ADB1B-97F7-4EF6-BC5C-3E0EC562A06F | Female | 1963-04-16 08:54:08.647 | White | Single | Icelandic | 11.160000 |
30 | A50BE9B4-8A0B-4169-B894-F7BD86D7D90B | Female | 1987-04-18 16:31:16.333 | Asian | Divorced | English | 14.070000 |
31 | 69CC25ED-A54A-4BAF-97E3-774BB3C9DED1 | Female | 1974-10-13 14:51:28.997 | Unknown | Married | English | 17.720000 |
32 | 69B5D2A0-12FD-46EF-A5FF-B29C4BAFBE49 | Male | 1954-05-02 16:38:15.957 | White | Married | English | 18.650000 |
33 | 8D389A8C-A6D8-4447-9DDE-1A28AB4EC667 | Female | 1964-07-10 08:50:13.707 | Asian | Married | English | 4.340000 |
34 | 967987B9-FFEF-4776-85CF-AE05CA81F583 | Male | 1947-02-13 10:26:55.520 | White | Widowed | English | 88.660000 |
35 | 9BBF3A51-443D-438B-9289-B98B8E0577C0 | Male | 1944-11-25 06:12:56.860 | White | Married | English | 16.760000 |
36 | 6D8008ED-D623-4BE4-B93B-335F9797C170 | Female | 1978-03-14 13:41:28.337 | White | Married | Icelandic | 14.810000 |
37 | B3892204-880B-40EF-B3BB-B824B50E99E5 | Male | 1978-08-06 08:38:27.330 | White | Single | Spanish | 16.860000 |
38 | 49DADA25-F2C2-42BB-8210-D78E6C7B0D48 | Male | 1962-04-08 10:18:26.263 | African American | Married | English | 19.500000 |
39 | 7548B6CF-79D9-461D-A0C5-20B861406FAC | Male | 1956-11-13 14:08:00.253 | White | Married | Icelandic | 14.030000 |
40 | B5D31F01-7273-4901-B56F-8139769A11EF | Female | 1971-04-02 18:56:59.353 | Asian | Unknown | Unknown | 82.080000 |
41 | 80D356B4-F974-441F-A5F2-F95986D119A2 | Female | 1938-03-06 18:24:18.297 | White | Single | English | 18.880000 |
42 | A0A976C8-9B30-4492-B8C4-5B25095B9192 | Male | 1931-05-26 14:54:15.847 | Asian | Single | English | 19.520000 |
43 | 016A590E-D093-4667-A5DA-D68EA6987D93 | Male | 1960-12-06 06:37:05.640 | White | Unknown | English | 15.020000 |
44 | 3B11D6B3-A36A-4B69-A437-C29BF425A941 | Female | 1954-05-18 10:19:19.110 | Asian | Separated | English | 10.800000 |
45 | E483DE6E-D4E6-47FD-905B-22EE86EC7ACE | Male | 1945-11-18 04:14:31.573 | African American | Single | English | 16.880000 |
46 | B2EB15FA-5431-4804-9309-4215BDC778C0 | Male | 1951-04-13 20:14:02.953 | White | Divorced | English | 19.410000 |
47 | 53B9FFDD-F80B-43BE-93CF-C34A023EE7E9 | Male | 1921-08-22 19:17:09.227 | White | Divorced | Icelandic | 18.170000 |
48 | C5D09468-574F-4802-B56F-DB38F4EB1687 | Male | 1939-07-07 19:39:49.753 | African American | Married | Icelandic | 10.870000 |
49 | EA7C2F0F-DA1C-4CE8-9700-4BB1FC7AF3FB | Male | 1969-01-05 00:15:09.627 | Unknown | Married | English | 3.300000 |
50 | 35FE7491-1A1D-48CB-810C-8DC2599AB3DD | Male | 1969-11-02 06:34:34.527 | White | Married | English | 4.510000 |
51 | 868E700E-3C56-458F-A477-078D671DCB20 | Female | 1978-09-21 15:29:44.770 | African American | Married | Icelandic | 5.940000 |
52 | 56A35E74-90BE-44A0-B7BA-7743BB152133 | Female | 1968-11-09 15:29:19.557 | White | Married | English | 8.810000 |
53 | CC12B481-B516-455B-884F-4CA900B29F2E | Female | 1985-10-21 07:59:04.777 | Unknown | Divorced | English | 14.580000 |
54 | 714823AF-C52C-414C-B53B-C43EACD194C3 | Male | 1952-05-08 23:51:50.127 | White | Married | English | 13.230000 |
55 | 3231F930-2978-4F50-8234-755449851E7B | Male | 1979-05-26 04:58:10.627 | White | Single | English | 18.360000 |
56 | C2CCB1AB-6633-4CB3-B4E8-157E6FB02376 | Female | 1964-05-07 10:20:37.740 | White | Single | Spanish | 15.980000 |
57 | 1311FEE4-2FDC-46E4-83D3-1550A3E51D2C | Female | 1988-03-28 03:09:22.807 | White | Single | English | 14.990000 |
58 | 21792512-2D40-4326-BEA2-A40127EB24FF | Male | 1938-03-24 19:25:53.980 | White | Single | Unknown | 89.440000 |
59 | EEAFC0B3-B835-4D99-AB33-2F9428E54E5F | Female | 1961-01-08 15:19:15.490 | Asian | Unknown | English | 18.950000 |
60 | 2EE42DEF-37CA-4694-827E-FA4EAF882BFC | Male | 1964-04-27 00:41:40.410 | White | Married | English | 18.910000 |
61 | 36775002-9EC3-4889-AD4F-80DC6855C8D8 | Female | 1963-07-16 22:16:48.477 | Asian | Single | Spanish | 15.560000 |
62 | 4C201C71-CCED-40D1-9642-F9C8C485B854 | Male | 1926-09-22 09:17:14.713 | African American | Married | English | 10.300000 |
63 | 2A8772FE-61DB-483E-B6BF-6C0A74BA9C2A | Female | 1966-10-14 15:31:48.427 | Asian | Single | English | 11.150000 |
64 | C65A4ADE-112E-49E4-B72A-0DED22C242ED | Male | 1955-04-07 18:55:17.077 | White | Married | English | 19.660000 |
65 | 80AC01B2-BD55-4BE0-A59A-4024104CF4E9 | Female | 1965-05-14 15:25:09.740 | African American | Married | English | 19.740000 |
66 | 6A57AC0C-57F3-4C19-98A1-51135EFBC4FF | Female | 1958-07-07 16:00:19.383 | Unknown | Married | English | 94.060000 |
67 | 7A025E77-7832-4F53-B9A7-09A3F98AC17E | Female | 1951-07-12 08:23:45.520 | Asian | Single | English | 16.690000 |
68 | DCE5AEB8-6DB9-4106-8AE4-02CCC5C23741 | Female | 1967-11-09 08:14:51.143 | White | Single | Unknown | 16.430000 |
69 | 886B5885-1EE2-49F3-98D5-A2F02EB8A9D4 | Female | 1970-04-27 09:15:11.080 | Unknown | Married | English | 10.350000 |
70 | 0E0EADE8-5592-4E0B-9F88-D7596E32EE08 | Female | 1942-07-13 21:15:37.517 | White | Married | Spanish | 18.800000 |
71 | 0681FA35-A794-4684-97BD-00B88370DB41 | Female | 1978-10-02 21:46:05.300 | Asian | Unknown | Spanish | 19.160000 |
72 | 2E26695A-EFB0-4C7F-9318-E3030B154E39 | Female | 1963-06-28 05:37:36.843 | Asian | Married | English | 14.200000 |
73 | DDC0BC57-7A4E-4E02-9282-177750B74FBC | Male | 1921-03-26 14:38:51.803 | White | Single | English | 18.410000 |
74 | E250799D-F6DE-4914-ADB4-B08A6E5029B9 | Female | 1945-08-04 19:03:00.757 | White | Single | Unknown | 12.860000 |
75 | 9C75DF1F-9DA6-4C98-8F5B-E10BDC805ED0 | Female | 1952-07-19 13:35:56.833 | White | Divorced | English | 88.650000 |
76 | A19A0B00-4C9A-4206-B1FE-17E6DA3CEB0B | Female | 1967-07-01 01:44:24.907 | Unknown | Married | English | 17.840000 |
77 | E5478913-6819-4977-BB11-4C8B61175B56 | Female | 1980-03-21 13:08:56.867 | White | Single | English | 12.920000 |
78 | 66154E24-D3EE-4311-89DB-6195278F9B3C | Female | 1944-08-26 13:03:24.297 | White | Divorced | English | 19.490000 |
79 | BC44CE19-9FC5-4AC9-A296-9EBC5E3D03AE | Female | 1953-06-04 03:16:17.843 | African American | Married | English | 15.040000 |
80 | CC9CDA72-B37A-4F8F-AFE4-B08F56A183BE | Female | 1942-04-14 14:01:01.130 | Asian | Married | Spanish | 14.150000 |
81 | EA9B67E2-15F0-450B-A8A6-14F6E4AE3D12 | Female | 1930-04-08 20:59:31.057 | Asian | Separated | Icelandic | 98.400000 |
82 | 3E462A8F-7B90-43A1-A8B6-AD82CB5002C9 | Female | 1986-12-20 16:41:34.793 | White | Married | Spanish | 18.330000 |
83 | 9E18822E-7D13-45C7-B50E-F95CFF92BC3E | Male | 1954-10-16 06:45:56.257 | Unknown | Single | English | 15.240000 |
84 | A7142B71-A144-4D56-BD14-3E966B01DB37 | Male | 1960-03-28 13:40:52.270 | White | Divorced | Spanish | 19.520000 |
85 | FE0B9B59-1927-45B7-8556-E079DC1DE30A | Male | 1955-08-23 01:01:22.260 | White | Divorced | English | 6.480000 |
86 | 98F593D2-8894-49BB-93B9-5A0E2CF85E2E | Female | 1944-07-15 19:04:11.487 | African American | Married | English | 9.100000 |
87 | 1A220558-5996-43E1-AE5D-7B96180FED35 | Male | 1937-09-07 22:23:53.143 | Asian | Married | English | 15.110000 |
88 | 7A7332AD-88B1-4848-9356-E5260E477C59 | Female | 1944-12-01 06:30:01.543 | Unknown | Married | English | 19.460000 |
89 | 6D5DCAC1-17FE-4D7C-923B-806EFBA3E6DF | Male | 1961-10-22 02:47:13.170 | Asian | Married | English | 13.900000 |
90 | 36E2F89E-777A-4D77-9D95-0D70A8AB416F | Male | 1980-05-30 13:23:50.703 | African American | Separated | English | 19.360000 |
91 | 672D554B-D6D1-40B2-A6A4-21A4CB6B1AA6 | Male | 1983-06-17 04:46:13.753 | African American | Single | Icelandic | 11.040000 |
92 | 03A481F5-B32A-4A91-BD42-43EB78FEBA77 | Female | 1968-02-07 23:02:38.017 | Asian | Single | Unknown | 93.230000 |
93 | 2A5251B1-0945-47FA-A65C-7A6381562591 | Female | 1942-01-07 16:45:33.060 | Unknown | Married | Spanish | 18.050000 |
94 | 8AF47463-8534-4203-B210-C2290F6CE689 | Female | 1952-06-27 17:29:04.187 | White | Divorced | English | 11.880000 |
95 | 135C831F-7DA5-46C0-959C-EBCBD8810B43 | Male | 1971-05-13 04:40:05.623 | White | Unknown | Spanish | 12.380000 |
96 | 8856096E-E59C-4156-A767-C091AF799C80 | Female | 1988-11-25 02:59:36.373 | White | Divorced | English | 11.080000 |
97 | 6623F5D6-D581-4268-9F9B-21612FBBF7B5 | Female | 1943-02-17 15:36:13.787 | Asian | Single | Spanish | 14.490000 |
98 | 65A7FBE0-EA9F-49E9-9824-D8F3AD98DAC0 | Female | 1962-11-30 06:28:33.110 | White | Separated | Spanish | 17.980000 |
99 | FB909FAE-72DD-4F6F-9828-D92183DF185F | Male | 1940-07-15 12:18:41.080 | White | Single | Spanish | 14.900000 |
13.2.1.2. Limit rows#
You can limit the number of rows to 10 by adding LIMIT 10
after the FROM
clause.
Query: Select all columns and from the PatientCorePopulatedTable
table and limit to 10 rows.
SELECT
*
FROM
PatientCorePopulatedTable
LIMIT 10;
Show code cell source
import sqlite3
import pandas as pd
conn = sqlite3.connect('100_patients.db')
sql_statement = """
SELECT
*
FROM
PatientCorePopulatedTable
LIMIT 10;
"""
df = pd.read_sql_query(sql_statement, conn)
df.style.set_table_attributes('style="font-size: 12px"')
PatientID | PatientGender | PatientDateOfBirth | PatientRace | PatientMaritalStatus | PatientLanguage | PatientPopulationPercentageBelowPoverty | |
---|---|---|---|---|---|---|---|
0 | FB2ABB23-C9D0-4D09-8464-49BF0B982F0F | Male | 1947-12-28 02:45:40.547 | Unknown | Married | Icelandic | 18.080000 |
1 | 64182B95-EB72-4E2B-BE77-8050B71498CE | Male | 1952-01-18 19:51:12.917 | African American | Separated | English | 13.030000 |
2 | DB22A4D9-7E4D-485C-916A-9CD1386507FB | Female | 1970-07-25 13:04:20.717 | Asian | Married | English | 6.670000 |
3 | 6E70D84D-C75F-477C-BC37-9177C3698C66 | Male | 1979-01-04 05:45:29.580 | White | Married | English | 16.090000 |
4 | C8556CC0-32FC-4CA5-A8CD-9CCF38816167 | Female | 1921-04-11 11:39:49.197 | White | Married | English | 18.200000 |
5 | 7FD13988-E58A-4A5C-8680-89AC200950FA | Male | 1965-07-12 15:41:20.523 | White | Married | Spanish | 12.410000 |
6 | C60FE675-CA52-4C55-A233-F4B27E94987F | Male | 1957-10-30 23:26:15.303 | Asian | Married | Spanish | 12.800000 |
7 | B39DC5AC-E003-4E6A-91B6-FC07625A1285 | Female | 1935-11-03 21:07:09.040 | White | Married | English | 15.310000 |
8 | FA157FA5-F488-4884-BF87-E144630D595C | Female | 1932-11-01 06:19:56.577 | White | Single | English | 16.320000 |
9 | B7E9FC4C-5182-4A34-954E-CEF5FC07E96D | Female | 1985-12-11 02:48:16.907 | Unknown | Single | English | 11.430000 |
13.2.1.3. Select some columns#
You can select columns from a table by specifying them after the SELECT
clause. Multiple
columns are separated by a comma (,
).
Query: Select the PatientID and PatientDateOfBirth columns and limit to 10 rows.
SELECT
PatientID,
PatientDateOfBirth
FROM
PatientCorePopulatedTable
LIMIT 10;
Show code cell source
import sqlite3
import pandas as pd
conn = sqlite3.connect('100_patients.db')
sql_statement = """
SELECT
PatientID,
PatientDateOfBirth
FROM
PatientCorePopulatedTable
LIMIT 10;
"""
df = pd.read_sql_query(sql_statement, conn)
df.style.set_table_attributes('style="font-size: 12px"')
PatientID | PatientDateOfBirth | |
---|---|---|
0 | FB2ABB23-C9D0-4D09-8464-49BF0B982F0F | 1947-12-28 02:45:40.547 |
1 | 64182B95-EB72-4E2B-BE77-8050B71498CE | 1952-01-18 19:51:12.917 |
2 | DB22A4D9-7E4D-485C-916A-9CD1386507FB | 1970-07-25 13:04:20.717 |
3 | 6E70D84D-C75F-477C-BC37-9177C3698C66 | 1979-01-04 05:45:29.580 |
4 | C8556CC0-32FC-4CA5-A8CD-9CCF38816167 | 1921-04-11 11:39:49.197 |
5 | 7FD13988-E58A-4A5C-8680-89AC200950FA | 1965-07-12 15:41:20.523 |
6 | C60FE675-CA52-4C55-A233-F4B27E94987F | 1957-10-30 23:26:15.303 |
7 | B39DC5AC-E003-4E6A-91B6-FC07625A1285 | 1935-11-03 21:07:09.040 |
8 | FA157FA5-F488-4884-BF87-E144630D595C | 1932-11-01 06:19:56.577 |
9 | B7E9FC4C-5182-4A34-954E-CEF5FC07E96D | 1985-12-11 02:48:16.907 |
13.2.1.4. Using column alias#
Notice that all the column names are prefixed by Patient
. You can give columns a different name, meaning an alias.
There are two ways to do alias. You can put the alias name right after the actual column name, e.g., PatientID PTID
or you can use the AS
keyword to indicate explicitly that you are aliasing a column name, e.g., PatientID AS PTID
.
Note that if the alias has a space, then it should be in quotes.
Query: Select the PatientID and PatientDateOfBirth columns, but alias PatientID to PTID and PatientDateOfBirth to “Date of Birth” and limit to 10 rows.
SELECT
PatientID PTID,
PatientDateOfBirth AS "Date of Birth"
FROM
PatientCorePopulatedTable
LIMIT 10;
Show code cell source
import sqlite3
import pandas as pd
conn = sqlite3.connect('100_patients.db')
sql_statement = """
SELECT
PatientID PTID,
PatientDateOfBirth AS "Date of Birth"
FROM
PatientCorePopulatedTable
LIMIT 10;
"""
df = pd.read_sql_query(sql_statement, conn)
df.style.set_table_attributes('style="font-size: 12px"')
PTID | Date of Birth | |
---|---|---|
0 | FB2ABB23-C9D0-4D09-8464-49BF0B982F0F | 1947-12-28 02:45:40.547 |
1 | 64182B95-EB72-4E2B-BE77-8050B71498CE | 1952-01-18 19:51:12.917 |
2 | DB22A4D9-7E4D-485C-916A-9CD1386507FB | 1970-07-25 13:04:20.717 |
3 | 6E70D84D-C75F-477C-BC37-9177C3698C66 | 1979-01-04 05:45:29.580 |
4 | C8556CC0-32FC-4CA5-A8CD-9CCF38816167 | 1921-04-11 11:39:49.197 |
5 | 7FD13988-E58A-4A5C-8680-89AC200950FA | 1965-07-12 15:41:20.523 |
6 | C60FE675-CA52-4C55-A233-F4B27E94987F | 1957-10-30 23:26:15.303 |
7 | B39DC5AC-E003-4E6A-91B6-FC07625A1285 | 1935-11-03 21:07:09.040 |
8 | FA157FA5-F488-4884-BF87-E144630D595C | 1932-11-01 06:19:56.577 |
9 | B7E9FC4C-5182-4A34-954E-CEF5FC07E96D | 1985-12-11 02:48:16.907 |
13.2.1.5. Adding columns not from the table#
Besides selecting columns in the table, you can also add the following columns to your query:
literals such as numbers or strings
Math expressions such as
PatientPopulationPercentageBelowPoverty + 1
, orPatientPopulationPercentageBelowPoverty * 100
Query: Select the PatientID, Hospital, and PatientPopulationPercentageBelowPoverty columns, alias PatientID to PTID,
make the Hospital
column ‘Buffalo General’, multiply PatientPopulationPercentageBelowPoverty
by 10 and alias it to
Poverty Level
, and limit to 10 rows. Note that if the alias name has a space, then the name needs to inside of quotes.
SELECT
PatientID PTID,
'Buffalo General' Hospital,
PatientPopulationPercentageBelowPoverty * 10 "Poverty Level"
FROM
PatientCorePopulatedTable
LIMIT 10;
Show code cell source
import sqlite3
import pandas as pd
conn = sqlite3.connect('100_patients.db')
sql_statement = """
SELECT
PatientID PTID,
'Buffalo General' Hospital,
PatientPopulationPercentageBelowPoverty * 10 "Poverty Level"
FROM
PatientCorePopulatedTable
LIMIT 10;
"""
df = pd.read_sql_query(sql_statement, conn)
df.style.set_table_attributes('style="font-size: 12px"')
PTID | Hospital | Poverty Level | |
---|---|---|---|
0 | FB2ABB23-C9D0-4D09-8464-49BF0B982F0F | Buffalo General | 180.800000 |
1 | 64182B95-EB72-4E2B-BE77-8050B71498CE | Buffalo General | 130.300000 |
2 | DB22A4D9-7E4D-485C-916A-9CD1386507FB | Buffalo General | 66.700000 |
3 | 6E70D84D-C75F-477C-BC37-9177C3698C66 | Buffalo General | 160.900000 |
4 | C8556CC0-32FC-4CA5-A8CD-9CCF38816167 | Buffalo General | 182.000000 |
5 | 7FD13988-E58A-4A5C-8680-89AC200950FA | Buffalo General | 124.100000 |
6 | C60FE675-CA52-4C55-A233-F4B27E94987F | Buffalo General | 128.000000 |
7 | B39DC5AC-E003-4E6A-91B6-FC07625A1285 | Buffalo General | 153.100000 |
8 | FA157FA5-F488-4884-BF87-E144630D595C | Buffalo General | 163.200000 |
9 | B7E9FC4C-5182-4A34-954E-CEF5FC07E96D | Buffalo General | 114.300000 |
13.2.1.6. Removing duplicates#
In some cases you might get duplicate rows. You remove these duplicate rows by putting the DISTINCT
keyword
after the SELECT
keyword. One use of this is to get distinct values of a given column.
Query: Select only the PatientMaritalStatus
column from the PatientCorePopulatedTable
table.
SELECT
PatientMaritalStatus
FROM
PatientCorePopulatedTable;
Show code cell source
import sqlite3
import pandas as pd
conn = sqlite3.connect('100_patients.db')
sql_statement = """
SELECT
PatientMaritalStatus
FROM
PatientCorePopulatedTable;
"""
df = pd.read_sql_query(sql_statement, conn)
df.style.set_table_attributes('style="font-size: 12px"')
PatientMaritalStatus | |
---|---|
0 | Married |
1 | Separated |
2 | Married |
3 | Married |
4 | Married |
5 | Married |
6 | Married |
7 | Married |
8 | Single |
9 | Single |
10 | Married |
11 | Divorced |
12 | Single |
13 | Married |
14 | Single |
15 | Married |
16 | Unknown |
17 | Single |
18 | Married |
19 | Single |
20 | Single |
21 | Married |
22 | Single |
23 | Married |
24 | Married |
25 | Single |
26 | Married |
27 | Single |
28 | Single |
29 | Single |
30 | Divorced |
31 | Married |
32 | Married |
33 | Married |
34 | Widowed |
35 | Married |
36 | Married |
37 | Single |
38 | Married |
39 | Married |
40 | Unknown |
41 | Single |
42 | Single |
43 | Unknown |
44 | Separated |
45 | Single |
46 | Divorced |
47 | Divorced |
48 | Married |
49 | Married |
50 | Married |
51 | Married |
52 | Married |
53 | Divorced |
54 | Married |
55 | Single |
56 | Single |
57 | Single |
58 | Single |
59 | Unknown |
60 | Married |
61 | Single |
62 | Married |
63 | Single |
64 | Married |
65 | Married |
66 | Married |
67 | Single |
68 | Single |
69 | Married |
70 | Married |
71 | Unknown |
72 | Married |
73 | Single |
74 | Single |
75 | Divorced |
76 | Married |
77 | Single |
78 | Divorced |
79 | Married |
80 | Married |
81 | Separated |
82 | Married |
83 | Single |
84 | Divorced |
85 | Divorced |
86 | Married |
87 | Married |
88 | Married |
89 | Married |
90 | Separated |
91 | Single |
92 | Single |
93 | Married |
94 | Divorced |
95 | Unknown |
96 | Divorced |
97 | Single |
98 | Separated |
99 | Single |
Query: Select the PatientMaritalStatus
column from the PatientCorePopulatedTable
table but
only select distinct values.
SELECT DISTINCT
PatientMaritalStatus
FROM
PatientCorePopulatedTable;
Show code cell source
import sqlite3
import pandas as pd
conn = sqlite3.connect('100_patients.db')
sql_statement = """
SELECT DISTINCT
PatientMaritalStatus
FROM
PatientCorePopulatedTable;
"""
df = pd.read_sql_query(sql_statement, conn)
df.style.set_table_attributes('style="font-size: 12px"')
PatientMaritalStatus | |
---|---|
0 | Married |
1 | Separated |
2 | Single |
3 | Divorced |
4 | Unknown |
5 | Widowed |
13.2.1.7. Removing duplicates with multiple columns#
The DISTINCT
keyword can also be used to find distinct combination of columns. It is also used sometimes with joins
to remove duplicate rows.
Query: Select PatientRace
and PatientMaritalStatus
columns from the PatientCorePopulatedTable
table but
only select distinct values. This query finds the distinct combinations of race and martial status.
SELECT DISTINCT
PatientRace,
PatientMaritalStatus
FROM
PatientCorePopulatedTable
ORDER BY PatientRace, PatientMaritalStatus
Show code cell source
import sqlite3
import pandas as pd
conn = sqlite3.connect('100_patients.db')
sql_statement = """
SELECT DISTINCT
PatientRace,
PatientMaritalStatus
FROM
PatientCorePopulatedTable
ORDER BY PatientRace, PatientMaritalStatus
"""
df = pd.read_sql_query(sql_statement, conn)
df.style.set_table_attributes('style="font-size: 12px"')
PatientRace | PatientMaritalStatus | |
---|---|---|
0 | African American | Married |
1 | African American | Separated |
2 | African American | Single |
3 | African American | Unknown |
4 | Asian | Divorced |
5 | Asian | Married |
6 | Asian | Separated |
7 | Asian | Single |
8 | Asian | Unknown |
9 | Unknown | Divorced |
10 | Unknown | Married |
11 | Unknown | Single |
12 | White | Divorced |
13 | White | Married |
14 | White | Separated |
15 | White | Single |
16 | White | Unknown |
17 | White | Widowed |
13.2.1.8. FROM
multiple tables#
The power of SQL lies in the fact that you can combine tables together based on some shared column between
tables. The FROM
clause allows you to select from multiple tables.
You should now that there are four types of tables in SQL:
Permanent tables (i.e., created using the
CREATE TABLE
statement)Derived tables (i.e., rows returned by a subquery and held in memory)
Temporary tables (i.e., volatile data held in memory)
Virtual tables (i.e, created using the
CREATE VIEW
statement)
We have been using permanent tables so far. Temporary and virtual tables will be covered later. The following is an example a derived table.
13.2.1.8.1. Derived table#
A derived query is a query held in memory. You surround it a pair of parenthesis and give it a name.
Query: Create a subquery called dx_codes
which selects the PrimaryDiagnosisCode
and PrimaryDiagnosisDescription
columns from the AdmissionsDiagnosesCorePopulatedTable
where the AdmissionID
is equal to 1
. Then use this derived query in another query that concatenates the diagnosis code and diagnosis description, e.g., “(M01.X) Direct infection of joint in infectious and parasitic diseases classified elsewhere”. Fields and string literals can be concatenated in SQLite using ||
.
SELECT '(' || dx_codes.code || ') ' || dx_codes.description AS CodeWDescription
FROM
(
SELECT
PrimaryDiagnosisCode code,
PrimaryDiagnosisDescription description
FROM
AdmissionsDiagnosesCorePopulatedTable
WHERE AdmissionID = 1
) dx_codes
Show code cell source
import sqlite3
import pandas as pd
conn = sqlite3.connect('100_patients.db')
sql_statement = """
SELECT '(' || dx_codes.code || ') ' || dx_codes.description AS CodeWDescription
FROM
(
SELECT
PrimaryDiagnosisCode code,
PrimaryDiagnosisDescription description
FROM
AdmissionsDiagnosesCorePopulatedTable
WHERE AdmissionID = 1
) dx_codes
"""
df = pd.read_sql_query(sql_statement, conn)
df.style.set_table_attributes('style="font-size: 12px"')
CodeWDescription | |
---|---|
0 | (C91.00) Acute lymphoblastic leukemia not having achieved remission |
1 | (M05.442) Rheumatoid myopathy with rheumatoid arthritis of left hand |
2 | (K91) Intraoperative and postprocedural complications and disorders of digestive system, not elsewhere classified |
3 | (M05.752) Rheumatoid arthritis with rheumatoid factor of left hip without organ or systems involvement |
4 | (C94.6) Myelodysplastic disease, not classified |
5 | (C92.51) Acute myelomonocytic leukemia, in remission |
6 | (D12.7) Benign neoplasm of rectosigmoid junction |
7 | (C79.72) Secondary malignant neoplasm of left adrenal gland |
8 | (M06.31) Rheumatoid nodule, shoulder |
9 | (Z13.810) Encounter for screening for upper gastrointestinal disorder |
10 | (Z13.83) Encounter for screening for respiratory disorder NEC |
11 | (D39.12) Neoplasm of uncertain behavior of left ovary |
12 | (O99.41) Diseases of the circulatory system complicating pregnancy |
13 | (O07.32) Renal failure following failed attempted termination of pregnancy |
14 | (E10.630) Type 1 diabetes mellitus with periodontal disease |
15 | (F11) Opioid related disorders |
16 | (D13.1) Benign neoplasm of stomach |
17 | (F31.4) Bipolar disorder, current episode depressed, severe, without psychotic features |
18 | (O10.113) Pre-existing hypertensive heart disease complicating pregnancy, third trimester |
19 | (N49.2) Inflammatory disorders of scrotum |
20 | (I25.82) Chronic total occlusion of coronary artery |
21 | (C51.2) Malignant neoplasm of clitoris |
22 | (H75.01) Mastoiditis in infectious and parasitic diseases classified elsewhere, right ear |
23 | (M01.X5) Direct infection of hip in infectious and parasitic diseases classified elsewhere |
24 | (E10.339) Type 1 diabetes mellitus with moderate nonproliferative diabetic retinopathy without macular edema |
25 | (C19) Malignant neoplasm of rectosigmoid junction |
26 | (I67.81) Acute cerebrovascular insufficiency |
27 | (C92.50) Acute myelomonocytic leukemia, not having achieved remission |
28 | (I25.84) Coronary atherosclerosis due to calcified coronary lesion |
29 | (M63) Disorders of muscle in diseases classified elsewhere |
30 | (O9A.1) Malignant neoplasm complicating pregnancy, childbirth and the puerperium |
31 | (C75.1) Malignant neoplasm of pituitary gland |
32 | (M11.071) Hydroxyapatite deposition disease, right ankle and foot |
33 | (F01.5) Vascular dementia |
34 | (G47) Sleep disorders |
35 | (G96.1) Disorders of meninges, not elsewhere classified |
36 | (C91.5) Adult T-cell lymphoma/leukemia (HTLV-1-associated) |
37 | (Z91.15) Patient's noncompliance with renal dialysis |
38 | (C92.12) Chronic myeloid leukemia, BCR/ABL-positive, in relapse |
39 | (M05.742) Rheumatoid arthritis with rheumatoid factor of left hand without organ or systems involvement |
40 | (J66.1) Flax-dressers' disease |
41 | (N18.1) Chronic kidney disease, stage 1 |
42 | (C88.3) Immunoproliferative small intestinal disease |
43 | (Z12) Encounter for screening for malignant neoplasms |
44 | (M05.261) Rheumatoid vasculitis with rheumatoid arthritis of right knee |
45 | (B40.1) Chronic pulmonary blastomycosis |
46 | (H34) Retinal vascular occlusions |
47 | (M24.251) Disorder of ligament, right hip |
48 | (O99.712) Diseases of the skin and subcutaneous tissue complicating pregnancy, second trimester |
49 | (I79.0) Aneurysm of aorta in diseases classified elsewhere |
50 | (A98.3) Marburg virus disease |
51 | (M05.722) Rheumatoid arthritis with rheumatoid factor of left elbow without organ or systems involvement |
52 | (H16.433) Localized vascularization of cornea, bilateral |
53 | (N25.1) Nephrogenic diabetes insipidus |
54 | (M02.37) Reiter's disease, ankle and foot |
55 | (M10.31) Gout due to renal impairment, shoulder |
56 | (I25.110) Atherosclerotic heart disease of native coronary artery with unstable angina pectoris |
57 | (E72.4) Disorders of ornithine metabolism |
58 | (O24.111) Pre-existing diabetes mellitus, type 2, in pregnancy, first trimester |
59 | (Z13.0) Encounter for screening for diseases of the blood and blood-forming organs and certain disorders involving the immune mechanism |
60 | (M24.22) Disorder of ligament, elbow |
61 | (C03.0) Malignant neoplasm of upper gum |
62 | (Z12.2) Encounter for screening for malignant neoplasm of respiratory organs |
63 | (C69.61) Malignant neoplasm of right orbit |
64 | (O10.213) Pre-existing hypertensive chronic kidney disease complicating pregnancy, third trimester |
65 | (M90.51) Osteonecrosis in diseases classified elsewhere, shoulder |
66 | (F63.81) Intermittent explosive disorder |
67 | (O10.13) Pre-existing hypertensive heart disease complicating the puerperium |
68 | (C94.22) Acute megakaryoblastic leukemia, in relapse |
69 | (C11) Malignant neoplasm of nasopharynx |
70 | (I43) Cardiomyopathy in diseases classified elsewhere |
71 | (E09.359) Drug or chemical induced diabetes mellitus with proliferative diabetic retinopathy without macular edema |
72 | (M90.6) Osteitis deformans in neoplastic diseases |
73 | (C72.1) Malignant neoplasm of cauda equina |
74 | (F12.180) Cannabis abuse with cannabis-induced anxiety disorder |
75 | (C18.3) Malignant neoplasm of hepatic flexure |
76 | (M05.272) Rheumatoid vasculitis with rheumatoid arthritis of left ankle and foot |
77 | (E10.32) Type 1 diabetes mellitus with mild nonproliferative diabetic retinopathy |
78 | (M06.361) Rheumatoid nodule, right knee |
79 | (H26.21) Cataract with neovascularization |
80 | (Z12.13) Encounter for screening for malignant neoplasm of small intestine |
81 | (D35.5) Benign neoplasm of carotid body |
82 | (D16.2) Benign neoplasm of long bones of lower limb |
83 | (M06.041) Rheumatoid arthritis without rheumatoid factor, right hand |
84 | (M05.732) Rheumatoid arthritis with rheumatoid factor of left wrist without organ or systems involvement |
85 | (M10.37) Gout due to renal impairment, ankle and foot |
86 | (C67.5) Malignant neoplasm of bladder neck |
87 | (M10.361) Gout due to renal impairment, right knee |
88 | (K08.121) Complete loss of teeth due to periodontal diseases, class I |
89 | (E30) Disorders of puberty, not elsewhere classified |
90 | (G71) Primary disorders of muscles |
91 | (C18.3) Malignant neoplasm of hepatic flexure |
92 | (O99.351) Diseases of the nervous system complicating pregnancy, first trimester |
93 | (M05.25) Rheumatoid vasculitis with rheumatoid arthritis of hip |
94 | (C57.21) Malignant neoplasm of right round ligament |
95 | (K00) Disorders of tooth development and eruption |
96 | (M63.832) Disorders of muscle in diseases classified elsewhere, left forearm |
97 | (C72.3) Malignant neoplasm of optic nerve |
98 | (M05.27) Rheumatoid vasculitis with rheumatoid arthritis of ankle and foot |
99 | (F06.1) Catatonic disorder due to known physiological condition |
13.2.1.9. The WHERE
Clause#
The WHERE
clause allows you to filter out unwanted rows. For string fields, you can use the equality operator (=
) or
the LIKE
operator. For numerical and date fields, you can use all the usually operators such as greater than, less than, etc.
WHERE
clauses can be combined using AND
and OR
. Parenthesis can be used to clarify grouping of the clauses. The
WHERE
clauses are put after the FROM
clause.
Query: Select all patients from PatientCorePopulatedTable
table that are either married and african american or married and white and
the PatientPopulationPercentageBelowPoverty
is above 15. Select the following columns: PatientID, PatientRace, PatientMaritalStatus and
PatientPopulationPercentageBelowPoverty.
SELECT
PatientID,
PatientRace,
PatientMaritalStatus,
PatientPopulationPercentageBelowPoverty
FROM
PatientCorePopulatedTable
WHERE ((PatientRace = 'White' AND PatientMaritalStatus = 'Married') OR (PatientRace = 'African American' AND PatientMaritalStatus = 'Married'))
AND PatientPopulationPercentageBelowPoverty > 15
Show code cell source
import sqlite3
import pandas as pd
conn = sqlite3.connect('100_patients.db')
sql_statement = """
SELECT
PatientID,
PatientRace,
PatientMaritalStatus,
PatientPopulationPercentageBelowPoverty
FROM
PatientCorePopulatedTable
WHERE (
(
PatientRace = 'White' AND PatientMaritalStatus = 'Married')
OR (PatientRace = 'African American' AND PatientMaritalStatus = 'Married')
)
AND PatientPopulationPercentageBelowPoverty > 15
"""
df = pd.read_sql_query(sql_statement, conn)
df.style.set_table_attributes('style="font-size: 12px"')
PatientID | PatientRace | PatientMaritalStatus | PatientPopulationPercentageBelowPoverty | |
---|---|---|---|---|
0 | 6E70D84D-C75F-477C-BC37-9177C3698C66 | White | Married | 16.090000 |
1 | C8556CC0-32FC-4CA5-A8CD-9CCF38816167 | White | Married | 18.200000 |
2 | B39DC5AC-E003-4E6A-91B6-FC07625A1285 | White | Married | 15.310000 |
3 | 25B786AF-0F99-478C-9CFA-0EA607E45834 | White | Married | 93.600000 |
4 | 81C5B13B-F6B2-4E57-9593-6E7E4C13B2CE | White | Married | 19.220000 |
5 | 69B5D2A0-12FD-46EF-A5FF-B29C4BAFBE49 | White | Married | 18.650000 |
6 | 9BBF3A51-443D-438B-9289-B98B8E0577C0 | White | Married | 16.760000 |
7 | 49DADA25-F2C2-42BB-8210-D78E6C7B0D48 | African American | Married | 19.500000 |
8 | 2EE42DEF-37CA-4694-827E-FA4EAF882BFC | White | Married | 18.910000 |
9 | C65A4ADE-112E-49E4-B72A-0DED22C242ED | White | Married | 19.660000 |
10 | 80AC01B2-BD55-4BE0-A59A-4024104CF4E9 | African American | Married | 19.740000 |
11 | 0E0EADE8-5592-4E0B-9F88-D7596E32EE08 | White | Married | 18.800000 |
12 | BC44CE19-9FC5-4AC9-A296-9EBC5E3D03AE | African American | Married | 15.040000 |
13 | 3E462A8F-7B90-43A1-A8B6-AD82CB5002C9 | White | Married | 18.330000 |
13.2.1.10. The ORDER BY
Clauses#
You can order the the rows by column(s) using the ORDER BY
clause. This clause is put after the WHERE
clause. You can specify
multiple columns separated by comma. You can also specify ascending order using the ASC
keyword after the column name
and descending order by using the DESC
keyword. The default sorting order is ascending. A shortcut for descending is putting -
before the column name. Finally
you can sort the columns by its numerical position.
13.2.1.10.1. Sort by columns#
Query: Select the PatientMaritalStatus
and PatientPopulationPercentageBelowPoverty
columns from
the PatientCorePopulatedTable
table and sort by PatientPopulationPercentageBelowPoverty
.
SELECT
PatientMaritalStatus,
PatientPopulationPercentageBelowPoverty
FROM
PatientCorePopulatedTable
ORDER BY PatientPopulationPercentageBelowPoverty;
Show code cell source
import sqlite3
import pandas as pd
conn = sqlite3.connect('100_patients.db')
sql_statement = """
SELECT
PatientMaritalStatus,
PatientPopulationPercentageBelowPoverty
FROM
PatientCorePopulatedTable
ORDER BY PatientPopulationPercentageBelowPoverty;
"""
df = pd.read_sql_query(sql_statement, conn)
df.style.set_table_attributes('style="font-size: 12px"')
PatientMaritalStatus | PatientPopulationPercentageBelowPoverty | |
---|---|---|
0 | Single | 1.700000 |
1 | Married | 3.300000 |
2 | Married | 4.340000 |
3 | Married | 4.510000 |
4 | Married | 5.940000 |
5 | Divorced | 6.480000 |
6 | Married | 6.670000 |
7 | Single | 7.590000 |
8 | Married | 8.810000 |
9 | Married | 9.100000 |
10 | Married | 10.300000 |
11 | Married | 10.350000 |
12 | Separated | 10.800000 |
13 | Married | 10.870000 |
14 | Single | 11.040000 |
15 | Divorced | 11.080000 |
16 | Single | 11.150000 |
17 | Single | 11.160000 |
18 | Married | 11.250000 |
19 | Single | 11.430000 |
20 | Divorced | 11.880000 |
21 | Married | 11.890000 |
22 | Married | 12.010000 |
23 | Unknown | 12.380000 |
24 | Married | 12.410000 |
25 | Married | 12.800000 |
26 | Single | 12.860000 |
27 | Single | 12.920000 |
28 | Separated | 13.030000 |
29 | Single | 13.100000 |
30 | Married | 13.230000 |
31 | Married | 13.900000 |
32 | Single | 13.970000 |
33 | Married | 14.030000 |
34 | Divorced | 14.070000 |
35 | Married | 14.150000 |
36 | Married | 14.200000 |
37 | Single | 14.280000 |
38 | Single | 14.490000 |
39 | Single | 14.490000 |
40 | Divorced | 14.580000 |
41 | Married | 14.810000 |
42 | Married | 14.870000 |
43 | Single | 14.900000 |
44 | Single | 14.990000 |
45 | Unknown | 15.020000 |
46 | Married | 15.040000 |
47 | Married | 15.110000 |
48 | Single | 15.240000 |
49 | Married | 15.310000 |
50 | Single | 15.560000 |
51 | Single | 15.720000 |
52 | Single | 15.980000 |
53 | Married | 16.090000 |
54 | Single | 16.240000 |
55 | Single | 16.320000 |
56 | Single | 16.430000 |
57 | Single | 16.690000 |
58 | Married | 16.710000 |
59 | Married | 16.760000 |
60 | Single | 16.860000 |
61 | Single | 16.880000 |
62 | Unknown | 16.890000 |
63 | Married | 17.720000 |
64 | Married | 17.840000 |
65 | Separated | 17.980000 |
66 | Married | 18.050000 |
67 | Married | 18.050000 |
68 | Married | 18.080000 |
69 | Divorced | 18.170000 |
70 | Married | 18.200000 |
71 | Married | 18.330000 |
72 | Single | 18.360000 |
73 | Single | 18.410000 |
74 | Married | 18.650000 |
75 | Married | 18.800000 |
76 | Single | 18.880000 |
77 | Married | 18.910000 |
78 | Unknown | 18.950000 |
79 | Unknown | 19.160000 |
80 | Married | 19.220000 |
81 | Separated | 19.360000 |
82 | Divorced | 19.410000 |
83 | Married | 19.460000 |
84 | Divorced | 19.490000 |
85 | Married | 19.500000 |
86 | Single | 19.520000 |
87 | Divorced | 19.520000 |
88 | Married | 19.660000 |
89 | Married | 19.740000 |
90 | Unknown | 82.080000 |
91 | Divorced | 83.750000 |
92 | Divorced | 88.650000 |
93 | Widowed | 88.660000 |
94 | Single | 89.440000 |
95 | Single | 93.230000 |
96 | Married | 93.600000 |
97 | Single | 94.000000 |
98 | Married | 94.060000 |
99 | Separated | 98.400000 |
Query: Select the PatientMaritalStatus
and PatientPopulationPercentageBelowPoverty
columns from
the PatientCorePopulatedTable
table and sort by PatientMaritalStatus
and PatientPopulationPercentageBelowPoverty
.
SELECT
PatientMaritalStatus,
PatientPopulationPercentageBelowPoverty
FROM
PatientCorePopulatedTable
ORDER BY PatientMaritalStatus, PatientPopulationPercentageBelowPoverty;
Show code cell source
import sqlite3
import pandas as pd
conn = sqlite3.connect('100_patients.db')
sql_statement = """
SELECT
PatientMaritalStatus,
PatientPopulationPercentageBelowPoverty
FROM
PatientCorePopulatedTable
ORDER BY PatientMaritalStatus, PatientPopulationPercentageBelowPoverty;
"""
df = pd.read_sql_query(sql_statement, conn)
df.style.set_table_attributes('style="font-size: 12px"')
PatientMaritalStatus | PatientPopulationPercentageBelowPoverty | |
---|---|---|
0 | Divorced | 6.480000 |
1 | Divorced | 11.080000 |
2 | Divorced | 11.880000 |
3 | Divorced | 14.070000 |
4 | Divorced | 14.580000 |
5 | Divorced | 18.170000 |
6 | Divorced | 19.410000 |
7 | Divorced | 19.490000 |
8 | Divorced | 19.520000 |
9 | Divorced | 83.750000 |
10 | Divorced | 88.650000 |
11 | Married | 3.300000 |
12 | Married | 4.340000 |
13 | Married | 4.510000 |
14 | Married | 5.940000 |
15 | Married | 6.670000 |
16 | Married | 8.810000 |
17 | Married | 9.100000 |
18 | Married | 10.300000 |
19 | Married | 10.350000 |
20 | Married | 10.870000 |
21 | Married | 11.250000 |
22 | Married | 11.890000 |
23 | Married | 12.010000 |
24 | Married | 12.410000 |
25 | Married | 12.800000 |
26 | Married | 13.230000 |
27 | Married | 13.900000 |
28 | Married | 14.030000 |
29 | Married | 14.150000 |
30 | Married | 14.200000 |
31 | Married | 14.810000 |
32 | Married | 14.870000 |
33 | Married | 15.040000 |
34 | Married | 15.110000 |
35 | Married | 15.310000 |
36 | Married | 16.090000 |
37 | Married | 16.710000 |
38 | Married | 16.760000 |
39 | Married | 17.720000 |
40 | Married | 17.840000 |
41 | Married | 18.050000 |
42 | Married | 18.050000 |
43 | Married | 18.080000 |
44 | Married | 18.200000 |
45 | Married | 18.330000 |
46 | Married | 18.650000 |
47 | Married | 18.800000 |
48 | Married | 18.910000 |
49 | Married | 19.220000 |
50 | Married | 19.460000 |
51 | Married | 19.500000 |
52 | Married | 19.660000 |
53 | Married | 19.740000 |
54 | Married | 93.600000 |
55 | Married | 94.060000 |
56 | Separated | 10.800000 |
57 | Separated | 13.030000 |
58 | Separated | 17.980000 |
59 | Separated | 19.360000 |
60 | Separated | 98.400000 |
61 | Single | 1.700000 |
62 | Single | 7.590000 |
63 | Single | 11.040000 |
64 | Single | 11.150000 |
65 | Single | 11.160000 |
66 | Single | 11.430000 |
67 | Single | 12.860000 |
68 | Single | 12.920000 |
69 | Single | 13.100000 |
70 | Single | 13.970000 |
71 | Single | 14.280000 |
72 | Single | 14.490000 |
73 | Single | 14.490000 |
74 | Single | 14.900000 |
75 | Single | 14.990000 |
76 | Single | 15.240000 |
77 | Single | 15.560000 |
78 | Single | 15.720000 |
79 | Single | 15.980000 |
80 | Single | 16.240000 |
81 | Single | 16.320000 |
82 | Single | 16.430000 |
83 | Single | 16.690000 |
84 | Single | 16.860000 |
85 | Single | 16.880000 |
86 | Single | 18.360000 |
87 | Single | 18.410000 |
88 | Single | 18.880000 |
89 | Single | 19.520000 |
90 | Single | 89.440000 |
91 | Single | 93.230000 |
92 | Single | 94.000000 |
93 | Unknown | 12.380000 |
94 | Unknown | 15.020000 |
95 | Unknown | 16.890000 |
96 | Unknown | 18.950000 |
97 | Unknown | 19.160000 |
98 | Unknown | 82.080000 |
99 | Widowed | 88.660000 |
13.2.1.10.2. Ascending versus Descending Sort Order#
Query: Select the PatientMaritalStatus
and PatientPopulationPercentageBelowPoverty
columns from
the PatientCorePopulatedTable
table and sort by PatientMaritalStatus
and PatientPopulationPercentageBelowPoverty
by descending order
using the keyword DESC
.
SELECT
PatientMaritalStatus,
PatientPopulationPercentageBelowPoverty
FROM
PatientCorePopulatedTable
ORDER BY PatientMaritalStatus, PatientPopulationPercentageBelowPoverty DESC;
Show code cell source
import sqlite3
import pandas as pd
conn = sqlite3.connect('100_patients.db')
sql_statement = """
SELECT
PatientMaritalStatus,
PatientPopulationPercentageBelowPoverty
FROM
PatientCorePopulatedTable
ORDER BY PatientMaritalStatus, PatientPopulationPercentageBelowPoverty DESC;
"""
df = pd.read_sql_query(sql_statement, conn)
df.style.set_table_attributes('style="font-size: 12px"')
PatientMaritalStatus | PatientPopulationPercentageBelowPoverty | |
---|---|---|
0 | Divorced | 88.650000 |
1 | Divorced | 83.750000 |
2 | Divorced | 19.520000 |
3 | Divorced | 19.490000 |
4 | Divorced | 19.410000 |
5 | Divorced | 18.170000 |
6 | Divorced | 14.580000 |
7 | Divorced | 14.070000 |
8 | Divorced | 11.880000 |
9 | Divorced | 11.080000 |
10 | Divorced | 6.480000 |
11 | Married | 94.060000 |
12 | Married | 93.600000 |
13 | Married | 19.740000 |
14 | Married | 19.660000 |
15 | Married | 19.500000 |
16 | Married | 19.460000 |
17 | Married | 19.220000 |
18 | Married | 18.910000 |
19 | Married | 18.800000 |
20 | Married | 18.650000 |
21 | Married | 18.330000 |
22 | Married | 18.200000 |
23 | Married | 18.080000 |
24 | Married | 18.050000 |
25 | Married | 18.050000 |
26 | Married | 17.840000 |
27 | Married | 17.720000 |
28 | Married | 16.760000 |
29 | Married | 16.710000 |
30 | Married | 16.090000 |
31 | Married | 15.310000 |
32 | Married | 15.110000 |
33 | Married | 15.040000 |
34 | Married | 14.870000 |
35 | Married | 14.810000 |
36 | Married | 14.200000 |
37 | Married | 14.150000 |
38 | Married | 14.030000 |
39 | Married | 13.900000 |
40 | Married | 13.230000 |
41 | Married | 12.800000 |
42 | Married | 12.410000 |
43 | Married | 12.010000 |
44 | Married | 11.890000 |
45 | Married | 11.250000 |
46 | Married | 10.870000 |
47 | Married | 10.350000 |
48 | Married | 10.300000 |
49 | Married | 9.100000 |
50 | Married | 8.810000 |
51 | Married | 6.670000 |
52 | Married | 5.940000 |
53 | Married | 4.510000 |
54 | Married | 4.340000 |
55 | Married | 3.300000 |
56 | Separated | 98.400000 |
57 | Separated | 19.360000 |
58 | Separated | 17.980000 |
59 | Separated | 13.030000 |
60 | Separated | 10.800000 |
61 | Single | 94.000000 |
62 | Single | 93.230000 |
63 | Single | 89.440000 |
64 | Single | 19.520000 |
65 | Single | 18.880000 |
66 | Single | 18.410000 |
67 | Single | 18.360000 |
68 | Single | 16.880000 |
69 | Single | 16.860000 |
70 | Single | 16.690000 |
71 | Single | 16.430000 |
72 | Single | 16.320000 |
73 | Single | 16.240000 |
74 | Single | 15.980000 |
75 | Single | 15.720000 |
76 | Single | 15.560000 |
77 | Single | 15.240000 |
78 | Single | 14.990000 |
79 | Single | 14.900000 |
80 | Single | 14.490000 |
81 | Single | 14.490000 |
82 | Single | 14.280000 |
83 | Single | 13.970000 |
84 | Single | 13.100000 |
85 | Single | 12.920000 |
86 | Single | 12.860000 |
87 | Single | 11.430000 |
88 | Single | 11.160000 |
89 | Single | 11.150000 |
90 | Single | 11.040000 |
91 | Single | 7.590000 |
92 | Single | 1.700000 |
93 | Unknown | 82.080000 |
94 | Unknown | 19.160000 |
95 | Unknown | 18.950000 |
96 | Unknown | 16.890000 |
97 | Unknown | 15.020000 |
98 | Unknown | 12.380000 |
99 | Widowed | 88.660000 |
Query: Select the PatientMaritalStatus
and PatientPopulationPercentageBelowPoverty
columns from
the PatientCorePopulatedTable
table and sort by PatientMaritalStatus
and PatientPopulationPercentageBelowPoverty
by descending order
using the -
.
SELECT
PatientMaritalStatus,
PatientPopulationPercentageBelowPoverty
FROM
PatientCorePopulatedTable
ORDER BY PatientMaritalStatus, -PatientPopulationPercentageBelowPoverty;
Show code cell source
import sqlite3
import pandas as pd
conn = sqlite3.connect('100_patients.db')
sql_statement = """
SELECT
PatientMaritalStatus,
PatientPopulationPercentageBelowPoverty
FROM
PatientCorePopulatedTable
ORDER BY PatientMaritalStatus, -PatientPopulationPercentageBelowPoverty;
"""
df = pd.read_sql_query(sql_statement, conn)
df.style.set_table_attributes('style="font-size: 12px"')
PatientMaritalStatus | PatientPopulationPercentageBelowPoverty | |
---|---|---|
0 | Divorced | 88.650000 |
1 | Divorced | 83.750000 |
2 | Divorced | 19.520000 |
3 | Divorced | 19.490000 |
4 | Divorced | 19.410000 |
5 | Divorced | 18.170000 |
6 | Divorced | 14.580000 |
7 | Divorced | 14.070000 |
8 | Divorced | 11.880000 |
9 | Divorced | 11.080000 |
10 | Divorced | 6.480000 |
11 | Married | 94.060000 |
12 | Married | 93.600000 |
13 | Married | 19.740000 |
14 | Married | 19.660000 |
15 | Married | 19.500000 |
16 | Married | 19.460000 |
17 | Married | 19.220000 |
18 | Married | 18.910000 |
19 | Married | 18.800000 |
20 | Married | 18.650000 |
21 | Married | 18.330000 |
22 | Married | 18.200000 |
23 | Married | 18.080000 |
24 | Married | 18.050000 |
25 | Married | 18.050000 |
26 | Married | 17.840000 |
27 | Married | 17.720000 |
28 | Married | 16.760000 |
29 | Married | 16.710000 |
30 | Married | 16.090000 |
31 | Married | 15.310000 |
32 | Married | 15.110000 |
33 | Married | 15.040000 |
34 | Married | 14.870000 |
35 | Married | 14.810000 |
36 | Married | 14.200000 |
37 | Married | 14.150000 |
38 | Married | 14.030000 |
39 | Married | 13.900000 |
40 | Married | 13.230000 |
41 | Married | 12.800000 |
42 | Married | 12.410000 |
43 | Married | 12.010000 |
44 | Married | 11.890000 |
45 | Married | 11.250000 |
46 | Married | 10.870000 |
47 | Married | 10.350000 |
48 | Married | 10.300000 |
49 | Married | 9.100000 |
50 | Married | 8.810000 |
51 | Married | 6.670000 |
52 | Married | 5.940000 |
53 | Married | 4.510000 |
54 | Married | 4.340000 |
55 | Married | 3.300000 |
56 | Separated | 98.400000 |
57 | Separated | 19.360000 |
58 | Separated | 17.980000 |
59 | Separated | 13.030000 |
60 | Separated | 10.800000 |
61 | Single | 94.000000 |
62 | Single | 93.230000 |
63 | Single | 89.440000 |
64 | Single | 19.520000 |
65 | Single | 18.880000 |
66 | Single | 18.410000 |
67 | Single | 18.360000 |
68 | Single | 16.880000 |
69 | Single | 16.860000 |
70 | Single | 16.690000 |
71 | Single | 16.430000 |
72 | Single | 16.320000 |
73 | Single | 16.240000 |
74 | Single | 15.980000 |
75 | Single | 15.720000 |
76 | Single | 15.560000 |
77 | Single | 15.240000 |
78 | Single | 14.990000 |
79 | Single | 14.900000 |
80 | Single | 14.490000 |
81 | Single | 14.490000 |
82 | Single | 14.280000 |
83 | Single | 13.970000 |
84 | Single | 13.100000 |
85 | Single | 12.920000 |
86 | Single | 12.860000 |
87 | Single | 11.430000 |
88 | Single | 11.160000 |
89 | Single | 11.150000 |
90 | Single | 11.040000 |
91 | Single | 7.590000 |
92 | Single | 1.700000 |
93 | Unknown | 82.080000 |
94 | Unknown | 19.160000 |
95 | Unknown | 18.950000 |
96 | Unknown | 16.890000 |
97 | Unknown | 15.020000 |
98 | Unknown | 12.380000 |
99 | Widowed | 88.660000 |
13.2.1.10.3. Sorting using numerical position#
When sorting by numerical position, you cannot use -
. You must use the keyword DESC
. Column names make your code more explicit, which you should prefer.
Query: Select the PatientMaritalStatus
and PatientPopulationPercentageBelowPoverty
columns from
the PatientCorePopulatedTable
table and sort by PatientMaritalStatus
and PatientPopulationPercentageBelowPoverty
by descending order
using the keyword DESC
, but using the numerical position of PatientPopulationPercentageBelowPoverty
.
SELECT
PatientMaritalStatus,
PatientPopulationPercentageBelowPoverty
FROM
PatientCorePopulatedTable
ORDER BY PatientMaritalStatus, 2 DESC;
Show code cell source
import sqlite3
import pandas as pd
conn = sqlite3.connect('100_patients.db')
sql_statement = """
SELECT
PatientMaritalStatus,
PatientPopulationPercentageBelowPoverty
FROM
PatientCorePopulatedTable
ORDER BY PatientMaritalStatus, 2 DESC;
"""
df = pd.read_sql_query(sql_statement, conn)
df.style.set_table_attributes('style="font-size: 12px"')
PatientMaritalStatus | PatientPopulationPercentageBelowPoverty | |
---|---|---|
0 | Divorced | 88.650000 |
1 | Divorced | 83.750000 |
2 | Divorced | 19.520000 |
3 | Divorced | 19.490000 |
4 | Divorced | 19.410000 |
5 | Divorced | 18.170000 |
6 | Divorced | 14.580000 |
7 | Divorced | 14.070000 |
8 | Divorced | 11.880000 |
9 | Divorced | 11.080000 |
10 | Divorced | 6.480000 |
11 | Married | 94.060000 |
12 | Married | 93.600000 |
13 | Married | 19.740000 |
14 | Married | 19.660000 |
15 | Married | 19.500000 |
16 | Married | 19.460000 |
17 | Married | 19.220000 |
18 | Married | 18.910000 |
19 | Married | 18.800000 |
20 | Married | 18.650000 |
21 | Married | 18.330000 |
22 | Married | 18.200000 |
23 | Married | 18.080000 |
24 | Married | 18.050000 |
25 | Married | 18.050000 |
26 | Married | 17.840000 |
27 | Married | 17.720000 |
28 | Married | 16.760000 |
29 | Married | 16.710000 |
30 | Married | 16.090000 |
31 | Married | 15.310000 |
32 | Married | 15.110000 |
33 | Married | 15.040000 |
34 | Married | 14.870000 |
35 | Married | 14.810000 |
36 | Married | 14.200000 |
37 | Married | 14.150000 |
38 | Married | 14.030000 |
39 | Married | 13.900000 |
40 | Married | 13.230000 |
41 | Married | 12.800000 |
42 | Married | 12.410000 |
43 | Married | 12.010000 |
44 | Married | 11.890000 |
45 | Married | 11.250000 |
46 | Married | 10.870000 |
47 | Married | 10.350000 |
48 | Married | 10.300000 |
49 | Married | 9.100000 |
50 | Married | 8.810000 |
51 | Married | 6.670000 |
52 | Married | 5.940000 |
53 | Married | 4.510000 |
54 | Married | 4.340000 |
55 | Married | 3.300000 |
56 | Separated | 98.400000 |
57 | Separated | 19.360000 |
58 | Separated | 17.980000 |
59 | Separated | 13.030000 |
60 | Separated | 10.800000 |
61 | Single | 94.000000 |
62 | Single | 93.230000 |
63 | Single | 89.440000 |
64 | Single | 19.520000 |
65 | Single | 18.880000 |
66 | Single | 18.410000 |
67 | Single | 18.360000 |
68 | Single | 16.880000 |
69 | Single | 16.860000 |
70 | Single | 16.690000 |
71 | Single | 16.430000 |
72 | Single | 16.320000 |
73 | Single | 16.240000 |
74 | Single | 15.980000 |
75 | Single | 15.720000 |
76 | Single | 15.560000 |
77 | Single | 15.240000 |
78 | Single | 14.990000 |
79 | Single | 14.900000 |
80 | Single | 14.490000 |
81 | Single | 14.490000 |
82 | Single | 14.280000 |
83 | Single | 13.970000 |
84 | Single | 13.100000 |
85 | Single | 12.920000 |
86 | Single | 12.860000 |
87 | Single | 11.430000 |
88 | Single | 11.160000 |
89 | Single | 11.150000 |
90 | Single | 11.040000 |
91 | Single | 7.590000 |
92 | Single | 1.700000 |
93 | Unknown | 82.080000 |
94 | Unknown | 19.160000 |
95 | Unknown | 18.950000 |
96 | Unknown | 16.890000 |
97 | Unknown | 15.020000 |
98 | Unknown | 12.380000 |
99 | Widowed | 88.660000 |