18 Pandas Basics Part 1 — Workbook
If you want to save your work from this notebook, you should be sure to make a copy of it on your own computer.
In this workbook, we’re going to explore the basics of the Python library Pandas.
18.1 Import Pandas
To use the Pandas library, we first need to import it.
import pandas as pd
18.2 Change Display Settings
By default, Pandas will display 60 rows and 20 columns. I often change Pandas’ default display settings to show more rows or columns.
pd.options.display.max_rows = 200
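As a minimal sketch of how display settings can be checked or changed only temporarily, the example below uses `pd.get_option` and `pd.option_context`, both standard Pandas functions. The value 50 for columns is an arbitrary choice for illustration.

```python
import pandas as pd

# Raise the row limit for the whole session (the same setting as above).
pd.options.display.max_rows = 200

# pd.get_option reads the current value back.
current_rows = pd.get_option("display.max_rows")
print(current_rows)

# option_context changes a setting only inside the with-block.
with pd.option_context("display.max_columns", 50):
    inside_cols = pd.get_option("display.max_columns")

# Outside the block, the previous column limit is restored.
outside_cols = pd.get_option("display.max_columns")
print(inside_cols, outside_cols)
```

The temporary form is handy when we want one wide or long printout without changing how every later cell displays.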
18.3 Get Data
To read in a CSV file, we will use the function pd.read_csv() and pass it the path to our file.
pd.read_csv('Bellevue_Almshouse_Dataset.csv')
| | date_in | first_name | last_name | full_name | age | gender | disease | profession | children | sent_to | sender1 | sender2 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1847-04-17 | Mary | Gallagher | Mary Gallagher | 28.0 | f | recent emigrant | married | Child Alana 10 days | Hospital | superintendent | hd. gibbens |
1 | 1847-04-08 | John | Sanin (?) | John Sanin (?) | 19.0 | m | recent emigrant | laborer | Catherine 2 mo | NaN | george w. anderson | edward witherell |
2 | 1847-04-17 | Anthony | Clark | Anthony Clark | 60.0 | m | recent emigrant | laborer | Charles Riley afed 10 days | Hospital | george w. anderson | edward witherell |
3 | 1847-04-08 | Lawrence | Feeney | Lawrence Feeney | 32.0 | m | recent emigrant | laborer | Child | NaN | george w. anderson | james donnelly |
4 | 1847-04-13 | Henry | Joyce | Henry Joyce | 21.0 | m | recent emigrant | NaN | Child 1 mo | NaN | george w. anderson | edward witherell |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
9593 | 1846-05-23 | Joseph | Aton | Joseph Aton | 69.0 | m | NaN | shoemaker | NaN | NaN | [blank] | NaN |
9594 | 1847-06-17 | Mary | Smith | Mary Smith | 47.0 | f | NaN | NaN | NaN | Hospital Ward 38 | [blank] | NaN |
9595 | 1847-06-22 | Francis | Riley | Francis Riley | 29.0 | m | lame | superintendent | NaN | NaN | [blank] | NaN |
9596 | 1847-07-02 | Martin | Dunn | Martin Dunn | 4.0 | m | NaN | NaN | NaN | NaN | [blank] | NaN |
9597 | 1847-07-08 | Elizabeth | Post | Elizabeth Post | 32.0 | f | NaN | NaN | NaN | Hospital | [blank] | NaN |
9598 rows × 12 columns
type(pd.read_csv('Bellevue_Almshouse_Dataset.csv'))
pandas.core.frame.DataFrame
This creates a Pandas DataFrame object, one of the two main data structures in Pandas. A DataFrame looks and acts a lot like a spreadsheet, but it has special powers and functions that we will discuss below and in the next few lessons.
Pandas objects | Explanation |
---|---|
DataFrame | Like a spreadsheet, 2-dimensional |
Series | Like a column, 1-dimensional |
We assign the DataFrame to a variable called bellevue_df. It is common convention to name DataFrame variables df, but we want to be a bit more specific.
bellevue_df = pd.read_csv('Bellevue_Almshouse_Dataset.csv')
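If the CSV file is not at hand, the same steps can be tried on a tiny in-memory sample. The sketch below is only an illustration: the three rows and the names `sample_csv` and `sample_df` are made up, and `io.StringIO` stands in for a file on disk.

```python
import io
import pandas as pd

# Hypothetical three-row sample; these rows are invented for illustration
# and are not taken from the Bellevue dataset.
sample_csv = io.StringIO(
    "first_name,age,gender\n"
    "Ann,30,f\n"
    "Ben,45,m\n"
    "Cara,,f\n"
)

# read_csv accepts a file path or any file-like object.
sample_df = pd.read_csv(sample_csv)

print(type(sample_df))   # a DataFrame, as in the cell above
print(sample_df.shape)   # (number of rows, number of columns)

# The empty age field is read in as a missing value (NaN).
print(sample_df["age"].isna().sum())
```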
18.4 Begin to Examine Patterns
18.4.1 Select Columns as Series Objects []
To select a column from the DataFrame, we will type the name of the DataFrame followed by square brackets and a column name in quotation marks.
bellevue_df['age']
0 28.0
1 19.0
2 60.0
3 32.0
4 21.0
...
9593 69.0
9594 47.0
9595 29.0
9596 4.0
9597 32.0
Name: age, Length: 9598, dtype: float64
Technically, a single column in a DataFrame is a Series object.
type(bellevue_df['age'])
pandas.core.series.Series
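To make the Series versus DataFrame distinction concrete, here is a small sketch on a made-up table (`toy_df` and its values are hypothetical). Single brackets with one column name return a Series, while a list of column names inside the brackets returns a DataFrame.

```python
import pandas as pd

# Hypothetical toy data, not drawn from the Bellevue dataset.
toy_df = pd.DataFrame(
    {"name": ["Ann", "Ben", "Cara"], "age": [30.0, 45.0, 22.0]}
)

# One column name in single brackets gives a 1-dimensional Series.
age_series = toy_df["age"]

# A list of column names gives a 2-dimensional DataFrame,
# even when the list holds only one name.
age_frame = toy_df[["age"]]
name_and_age = toy_df[["name", "age"]]

print(type(age_series))
print(type(age_frame))
print(name_and_age.shape)
```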
18.5 Pandas Methods
Pandas method | Explanation |
---|---|
.sum() | Sum of values |
.mean() | Mean of values |
.median() | Median of values |
.min() | Minimum |
.max() | Maximum |
.mode() | Mode |
.std() | Unbiased standard deviation |
.count() | Total number of non-blank values |
.value_counts() | Frequency of unique values |
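As a rough illustration of how these methods behave, the sketch below applies several of them to a small made-up Series (`toy_ages` is hypothetical). Note that missing values are skipped by default, which is why `.count()` can be smaller than the length of the Series.

```python
import pandas as pd

# Hypothetical ages with one missing value.
toy_ages = pd.Series([20.0, 30.0, 30.0, None, 40.0], name="age")

total = toy_ages.sum()        # 120.0, the missing value is ignored
average = toy_ages.mean()     # 30.0, computed over the 4 non-missing values
middle = toy_ages.median()    # 30.0
youngest = toy_ages.min()     # 20.0
oldest = toy_ages.max()       # 40.0
most_common = toy_ages.mode() # a Series, since ties are possible
non_blank = toy_ages.count()  # 4 non-missing values
length = len(toy_ages)        # 5 entries, including the missing one

print(total, average, middle, youngest, oldest)
print(most_common.tolist(), non_blank, length)
```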
18.5.1 ❓ How old (on average) were the people admitted to the Bellevue Almshouse?
bellevue_df['age']
0 28.0
1 19.0
2 60.0
3 32.0
4 21.0
...
9593 69.0
9594 47.0
9595 29.0
9596 4.0
9597 32.0
Name: age, Length: 9598, dtype: float64
18.5.2 ❓ How old was the oldest person admitted to Bellevue?
bellevue_df['age']
0 28.0
1 19.0
2 60.0
3 32.0
4 21.0
...
9593 69.0
9594 47.0
9595 29.0
9596 4.0
9597 32.0
Name: age, Length: 9598, dtype: float64
18.5.3 ❓ How young was the youngest person?
bellevue_df['age']
0 28.0
1 19.0
2 60.0
3 32.0
4 21.0
...
9593 69.0
9594 47.0
9595 29.0
9596 4.0
9597 32.0
Name: age, Length: 9598, dtype: float64
18.5.4 ❓ What were the most common professions among these Irish immigrants?
To count the values in a column, we can use the .value_counts() method.
What patterns do you notice in this list? What seems strange to you? What can we learn about the people in the dataset and the people who created the dataset?
bellevue_df['profession'].value_counts()
laborer 3116
married 1586
spinster 1522
widow 1055
shoemaker 158
tailor 116
blacksmith 104
mason 99
weaver 66
carpenter 65
baker 48
waiter 41
clerk 28
stone cutter 27
painter 26
gardener 25
cooper 24
farmer 21
peddler 20
cartman 15
wheelwright 14
hostler 12
hatter 12
printer 11
butcher 11
tinsmith 10
boot maker 10
teacher 10
coachman 10
sailor 10
servant 9
boiler maker 9
harness maker 9
cabinet maker 8
boatman 8
nailer 8
porter 6
plasterer 6
nail maker 6
(illegible) 6
umbrella maker 6
seaman 5
soldier 5
marble polisher 5
grocer 5
machinist 5
brick layer 4
ship carpenter 4
barkeeper 4
book maker 4
hackman 4
turner 4
dyer 4
tinman 4
miner 3
merchant 3
starch maker 3
courier 3
seamstress 3
saddler 3
varnisher 3
locksmith 3
brass founder 3
pavier 3
store keeper 3
brewer 3
paper stainer 3
lawyer 2
barker 2
morocco dresser 2
cab driver 2
carver 2
glass cutter 2
copper smith 2
barber 2
engraver 2
engineer 2
food carrier 2
quarry man 2
rigger 2
stevedore 2
single 2
sail maker 2
stove maker 2
shipwright 2
type caster 2
glove maker 2
tanner 2
sawyer 2
calico printer 2
paper maker 2
chair maker 2
gw anderson per e witherell 1
drayman 1
groom 1
plumber 1
soap boiler 1
paner 1
wool manufacturer 1
joiner 1
hodman 1
leather draper 1
flag cutter 1
rectifier 1
parrier 1
teamster 1
clery 1
book seller 1
manufacturer 1
wood sawyer 1
auctioneer 1
dugget 1
mariner 1
cook 1
polisher 1
sham 1
tavern keeper 1
surveyor 1
ship sawyer 1
oysterman 1
hacker 1
moulder 1
magician 1
builder 1
soapmaker 1
jeweller 1
strap maker 1
upholsterer 1
book keeper 1
school teacher 1
shop keeper 1
iron moulder 1
glover 1
flagger 1
book binder 1
apothecary 1
saw maker 1
marble cutter 1
jobber 1
flag pavier 1
musician 1
marble sawyer 1
leather dresser 1
chandler 1
paper-carrier 1
book pedlar 1
cabrener 1
miniature painter 1
cabman 1
truss maker 1
marketmab 1
copoper 1
stage driver 1
brass turner 1
fishman 1
brush maker 1
soap comber 1
manow(?) 1
waterman 1
cloth printer 1
stone sawyer 1
schoolmaster 1
dancing master 1
music painter 1
croper 1
cotton sampler 1
gas fitter 1
gas manufacturer 1
caulker 1
basket maker 1
gun maker 1
superintendent 1
Name: profession, dtype: int64
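A long list like this can be easier to read when trimmed or expressed as shares. The sketch below shows a few standard options of `.value_counts()` on a made-up Series (`toy_jobs` and its values are hypothetical): `.head()` to keep only the top entries, `normalize=True` for proportions, and `dropna=False` to include missing values in the tally.

```python
import pandas as pd

# Hypothetical occupations with one missing entry.
toy_jobs = pd.Series(
    ["laborer", "laborer", "tailor", None, "laborer", "tailor", "baker"]
)

# Counts are sorted from most to least frequent; missing values are dropped.
counts = toy_jobs.value_counts()

# Keep only the two most frequent values.
top_two = counts.head(2)

# Proportions of the non-missing values instead of raw counts.
shares = toy_jobs.value_counts(normalize=True)

# Include missing values as their own category.
with_missing = toy_jobs.value_counts(dropna=False)

print(top_two)
print(shares["laborer"])
print(with_missing.sum())
```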
18.5.5 ❓ What are the most common diseases?
bellevue_df['disease'].value_counts()
sickness 2710
recent emigrant 1975
destitution 846
fever 192
insane 138
pregnant 134
sore 79
intemperance 71
illegible 47
typhus 46
injuries 32
ulcers 26
ophthalmia 19
vagrant 17
lame 15
debility 12
rheumatism 11
blind 9
bronchitis 9
dropsy 8
phthisis 8
syphilis 7
old age 7
dysentery 6
erysipelas 6
diarrhea 6
cripple 5
broken bone 5
measles 3
drunkenness 3
burn 3
delusion dreams 2
scrofula 2
tuberculosis 2
pneumonia 2
fits 2
abandonment 2
piles 2
sprain 2
jaundice 2
scarletina 2
phagadaena 1
spinal disease 1
tumor 1
smallpox 1
horrors 1
hernia 1
paralysis 1
abscess 1
neuralgia 1
hypochondria 1
ungovernable 1
from trial 1
sunburn 1
colic 1
orchitis 1
beggar 1
contusion 1
rickets 1
ascites 1
cut 1
deaf 1
congested head 1
eczema 1
bruise 1
severed limb 1
emotional 1
poorly 1
disabled 1
bleeding 1
seizure 1
del femur 1
throat cut 1
ague 1
asthma 1
Name: disease, dtype: int64
18.5.6 ❓ Where were most people sent?
bellevue_df['sent_to'].value_counts()
Hospital 3882
Blackwell's Island 571
Bellevue Garret 250
Randall's Island 172
Shanty 109
Lunatic Asylum 93
CHECK 90
Bellevue Hospital Chapel 78
Almshouse 64
Hospital Ward 38 62
Long Island Farms 45
Hospital Ward 46 42
Lunatic Asylum (?) 37
Hospital Ward 18 23
Blackwell's Island Ward 38 14
Hospital Ward 39 11
Hospital Ward 16 11
Hospital Blackwell's Island 10
Hospital on Blackwell's Island 10
Lunatic Asylum Ward 38 5
Hospital Ward 13 5
Hosptial Ward 17 5
Hospital Ward 11 4
Hospital Ward 17 3
Hospital Morgue 3
Hospital Ward 22 3
Hospital Ward 9 3
Almshouse on Blackwell's Island 2
Hospital Ward 45 2
Children's Home 2
Hospital Ward 6 2
Blackwell's Island Ward 39 2
Lunatic Asylum Ward 28 2
Hospital Ward 32 2
Hospital Ward 42 2
Hospital Ward 24 2
Hospital Ward 12 2
Shanty 38 2
Blackwell's Island Workhouse (?) 1
Lunatic Asylum Ward 5 1
Blackwell's Island Ward 19 1
Blackwell's Island Ward 7 1
Blackwell's Island Ward 8 1
Randall's Island Ward 8 1
Hospital Women's Ward 1
Hospital Ward 5 1
Lunatic Asylum Blackwell's Island 1
Blackwell's Island Ward 17 1
Shanty 6 1
Randall's Island Ward 38 1
Hospital Ward 28 1
Hospital Ward 34 1
Hosital Ward 45 1
Blackwell's Island Ward 11 1
Hospital Ward 26 1
Blackwell's Island Ward 18 1
Blackwell's Island Ward 5 1
Blackwell's Island Shanty 1
Blackwell's Island Ward 10 1
Hospital Ward 36 1
Hospital Ward 35 1
Hospital Ward 21 1
Hospital Ward 15 1
Blackwell's Island Ward 16 1
Women's Hospital 1
Almshouse Hospital 1
Blackwell's Island Ward 4 1
Hospital Ward 29 1
Randall's Island Ward 17 1
Shanty 18 1
Shanty 1 1
Randall's Island Ward 6 1
Blackwell's Island Ward 35 1
Hospital Ward 30 1
Lunatic Asylum Ward 18 1
Lunatic Asylum Ward 16 1
Shanty 8 1
Name: sent_to, dtype: int64
18.6 Examine Subsets
18.6.1 ❓ Why were people being sent to Hospital Ward 38?
To explore this question, we can filter rows with a condition.
bellevue_df['sent_to'] == 'Hospital Ward 38'
0 False
1 False
2 False
3 False
4 False
...
9593 False
9594 True
9595 False
9596 False
9597 False
Name: sent_to, Length: 9598, dtype: bool
bellevue_df[bellevue_df['sent_to'] == 'Hospital Ward 38']
| | date_in | first_name | last_name | full_name | age | gender | disease | profession | children | sent_to | sender1 | sender2 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
249 | 1847-05-17 | Elizabeth | Cauley | Elizabeth Cauley | 24.0 | f | recent emigrant | married | Son Walter 4 mo | Hospital Ward 38 | moses g. leonard | peter c. johnston |
330 | 1847-03-22 | Sarah | Corrigan | Sarah Corrigan | 21.0 | f | recent emigrant | married | NaN | Hospital Ward 38 | george w. anderson | peter c. johnston |
367 | 1847-06-13 | Bridget | Reynolds | Bridget Reynolds | 20.0 | f | pregnant | married | NaN | Hospital Ward 38 | moses g. leonard | oscar s. field |
499 | 1847-06-09 | Rose | Dinns | Rose Dinns | 22.0 | f | pregnant | married | NaN | Hospital Ward 38 | moses g. leonard | peter c. johnston |
698 | 1847-08-21 | Bridget | Redding | Bridget Redding | 25.0 | f | pregnant | married | NaN | Hospital Ward 38 | william w. lyons | NaN |
1041 | 1847-03-13 | Betty | Dunn | Betty Dunn | 34.0 | f | recent emigrant | married | NaN | Hospital Ward 38 | george w. anderson | peter c. johnston |
1290 | 1847-02-17 | Catherine | Doherty | Catherine Doherty | 34.0 | f | recent emigrant | widow | NaN | Hospital Ward 38 | george w. anderson | NaN |
1426 | 1847-05-31 | Catherine | Riley | Catherine Riley | 28.0 | f | recent emigrant | married | NaN | Hospital Ward 38 | moses g. leonard | peter c. johnston |
1494 | 1847-06-14 | Brigt | Frikee? | Brigt Frikee? | 25.0 | f | recent emigrant | married | NaN | Hospital Ward 38 | moses g. leonard | peter c. johnston |
1584 | 1847-10-28 | Ellen | Lamb | Ellen Lamb | 20.0 | f | recent emigrant | married | NaN | Hospital Ward 38 | william w. lyons | NaN |
1685 | 1847-03-09 | Catherine | McManus | Catherine McManus | 21.0 | f | recent emigrant | widow | NaN | Hospital Ward 38 | george w. anderson | james donnelly |
1746 | 1847-05-08 | Ellen | O'Brien | Ellen O'Brien | 20.0 | f | pregnant | spinster | NaN | Hospital Ward 38 | george w. anderson | peter c. johnston |
1863 | 1847-06-01 | Bridget | Nash | Bridget Nash | 30.0 | f | recent emigrant | married | NaN | Hospital Ward 38 | moses g. leonard | peter c. johnston |
1960 | 1847-05-13 | Mary | Gallagher | Mary Gallagher | 18.0 | f | pregnant | married | NaN | Hospital Ward 38 | george w. anderson | peter c. johnston |
2370 | 1847-02-13 | Eliza | Latimer | Eliza Latimer | 19.0 | f | recent emigrant | married | NaN | Hospital Ward 38 | george w. anderson | NaN |
5194 | 1847-02-22 | Mary | Toohey | Mary Toohey | 34.0 | f | pregnant | married | NaN | Hospital Ward 38 | george w. anderson | peter c. johnston |
5286 | 1847-03-16 | Mary | Mullen | Mary Mullen | 27.0 | f | pregnant | married | NaN | Hospital Ward 38 | george w. anderson | james donnelly |
5309 | 1847-03-22 | Margaret | Welsh | Margaret Welsh | 24.0 | f | pregnant | married | NaN | Hospital Ward 38 | george w. anderson | benson s. hopkins |
5310 | 1847-03-22 | Elizabeth | McDonald | Elizabeth McDonald | 26.0 | f | pregnant | married | NaN | Hospital Ward 38 | george w. anderson | edward witherell |
5321 | 1847-03-24 | Mary Jane | Stevens | Mary Jane Stevens | 30.0 | f | pregnant | spinster | NaN | Hospital Ward 38 | george w. anderson | edward witherell |
5445 | 1847-04-15 | Alice | Duffy | Alice Duffy | 28.0 | f | pregnant | spinster | NaN | Hospital Ward 38 | george w. anderson | peter c. johnston |
5629 | 1847-05-10 | Jane | McCann | Jane McCann | 22.0 | f | pregnant | spinster | NaN | Hospital Ward 38 | george w. anderson | peter c. johnston |
5636 | 1847-05-10 | Susan | Patterson | Susan Patterson | 20.0 | f | pregnant | spinster | NaN | Hospital Ward 38 | george w. anderson | peter c. johnston |
5777 | 1847-08-11 | Bridget | Grady | Bridget Grady | 35.0 | f | pregnant | spinster | NaN | Hospital Ward 38 | thomas b. tappen | NaN |
5935 | 1847-07-15 | Ellen | Connolly | Ellen Connolly | 25.0 | f | pregnant | married | NaN | Hospital Ward 38 | william w. lyons | NaN |
6003 | 1847-07-19 | Mary | Kelly | Mary Kelly | 27.0 | f | sickness | married | NaN | Hospital Ward 38 | william w. lyons | charles j. sutton |
6051 | 1847-07-22 | Ann | Clark | Ann Clark | 23.0 | f | pregnant | spinster | NaN | Hospital Ward 38 | william w. lyons | NaN |
6052 | 1847-07-22 | Mary | Diffen | Mary Diffen | 32.0 | f | pregnant | widow | NaN | Hospital Ward 38 | william w. lyons | NaN |
6053 | 1847-07-22 | Mary Ann | Williamson | Mary Ann Williamson | 50.0 | f | pregnant | widow | NaN | Hospital Ward 38 | william w. lyons | NaN |
6102 | 1847-07-25 | Margaret | McCabe | Margaret McCabe | 40.0 | f | pregnant | spinster | NaN | Hospital Ward 38 | william w. lyons | NaN |
6265 | 1847-08-06 | Elizabeth | Wilkinson | Elizabeth Wilkinson | 13.0 | f | pregnant | married | NaN | Hospital Ward 38 | william w. lyons | NaN |
7140 | 1847-10-20 | Bridget | Riley | Bridget Riley | 18.0 | f | pregnant | spinster | NaN | Hospital Ward 38 | william w. lyons | NaN |
7209 | 1847-10-27 | Ann | Rourke | Ann Rourke | 20.0 | f | pregnant | spinster | NaN | Hospital Ward 38 | william w. lyons | NaN |
7221 | 1847-10-28 | Catherine | McDermott | Catherine McDermott | 28.0 | f | pregnant | married | NaN | Hospital Ward 38 | william w. lyons | NaN |
7280 | 1847-11-02 | Ellen | Sweeney | Ellen Sweeney | 20.0 | f | pregnant | spinster | NaN | Hospital Ward 38 | william w. lyons | NaN |
7297 | 1847-11-04 | Mary | Smith | Mary Smith | 30.0 | f | pregnant | spinster | NaN | Hospital Ward 38 | william w. lyons | NaN |
7312 | 1847-11-05 | Bridget | Campbell | Bridget Campbell | 23.0 | f | pregnant | spinster | NaN | Hospital Ward 38 | william w. lyons | NaN |
7338 | 1847-11-08 | Bridget | Laughlin | Bridget Laughlin | 21.0 | f | pregnant | spinster | NaN | Hospital Ward 38 | william w. lyons | NaN |
7347 | 1847-11-09 | Catherine | Gillespie | Catherine Gillespie | 34.0 | f | pregnant | married | NaN | Hospital Ward 38 | william w. lyons | NaN |
7417 | 1847-11-16 | Eliza | Martin | Eliza Martin | 30.0 | f | pregnant | married | NaN | Hospital Ward 38 | william w. lyons | NaN |
7470 | 1847-11-20 | Catherine | O'Brien | Catherine O'Brien | 29.0 | f | pregnant | married | NaN | Hospital Ward 38 | william w. lyons | NaN |
8009 | 1847-06-15 | Hannah | Keatherson | Hannah Keatherson | 14.0 | f | pregnant | married | NaN | Hospital Ward 38 | moses g. leonard | edward witherell |
8100 | 1847-02-12 | Mary | McGrath | Mary McGrath | 26.0 | f | pregnant | widow | NaN | Hospital Ward 38 | moses g. leonard | benson s. hopkins |
8101 | 1847-02-18 | Jane | Smith | Jane Smith | 25.0 | f | pregnant | spinster | NaN | Hospital Ward 38 | moses g. leonard | benson s. hopkins |
8125 | 1847-04-19 | Mary | Smith | Mary Smith | 23.0 | f | pregnant | spinster | NaN | Hospital Ward 38 | moses g. leonard | benson s. hopkins |
8129 | 1847-04-22 | Ellen | McCalu | Ellen McCalu | 24.0 | f | pregnant | seamstress | NaN | Hospital Ward 38 | moses g. leonard | commissioners of emigration |
8134 | 1847-05-08 | Judah | Fallen | Judah Fallen | 30.0 | f | pregnant | widow | NaN | Hospital Ward 38 | moses g. leonard | benson s. hopkins |
8180 | 1847-05-15 | Catherine | Seele | Catherine Seele | 25.0 | f | pregnant | married | NaN | Hospital Ward 38 | moses g. leonard | edward witherell |
8203 | 1847-05-18 | Mary | Andrews | Mary Andrews | 26.0 | f | pregnant | married | NaN | Hospital Ward 38 | moses g. leonard | peter c. johnston |
8285 | 1847-05-26 | Ellen | McCake | Ellen McCake | 23.0 | f | pregnant | spinster | NaN | Hospital Ward 38 | moses g. leonard | benson s. hopkins |
8303 | 1847-05-27 | Sarah | Campbell | Sarah Campbell | 22.0 | f | pregnant | married | NaN | Hospital Ward 38 | moses g. leonard | edward witherell |
8364 | 1847-06-01 | Bridget | Keanan | Bridget Keanan | 20.0 | f | pregnant | married | NaN | Hospital Ward 38 | moses g. leonard | benson s. hopkins |
8379 | 1847-06-02 | Sarah | Cormick | Sarah Cormick | 46.0 | f | destitution | widow | NaN | Hospital Ward 38 | moses g. leonard | edward witherell |
8410 | 1847-06-04 | Mary | Hart | Mary Hart | 37.0 | f | pregnant | married | NaN | Hospital Ward 38 | moses g. leonard | edward witherell |
8424 | 1847-06-05 | Ann | Smith | Ann Smith | 24.0 | f | pregnant | married | NaN | Hospital Ward 38 | moses g. leonard | edward witherell |
8479 | 1847-06-11 | Mary | McMagh | Mary McMagh | 28.0 | f | pregnant | spinster | NaN | Hospital Ward 38 | moses g. leonard | oscar s. field |
8498 | 1847-06-13 | Mary | Henry | Mary Henry | 27.0 | f | pregnant | married | NaN | Hospital Ward 38 | moses g. leonard | edward witherell |
8506 | 1847-06-14 | Mary | Keeler | Mary Keeler | 44.0 | f | pregnant | cook | NaN | Hospital Ward 38 | moses g. leonard | oscar s. field |
8541 | 1847-06-17 | Mary | Smith | Mary Smith | 23.0 | f | pregnant | NaN | NaN | Hospital Ward 38 | moses g. leonard | peter c. johnston |
8581 | 1847-06-19 | Martha | McConnelly | Martha McConnelly | 35.0 | f | pregnant | spinster | NaN | Hospital Ward 38 | moses g. leonard | peter c. johnston |
8714 | 1847-06-26 | Jane | Davis | Jane Davis | 18.0 | f | pregnant | married | NaN | Hospital Ward 38 | moses g. leonard | edward witherell |
9594 | 1847-06-17 | Mary | Smith | Mary Smith | 47.0 | f | NaN | NaN | NaN | Hospital Ward 38 | [blank] | NaN |
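The two steps above, building a True/False Series and then using it to keep only the matching rows, can be sketched on a small made-up table. Everything here (`toy_df`, the ward names, the values) is hypothetical. Once we have a subset, we can apply the earlier methods to it, and conditions can be combined with `&` as long as each one is wrapped in parentheses.

```python
import pandas as pd

# Hypothetical toy data with one missing destination.
toy_df = pd.DataFrame(
    {
        "age": [24.0, 60.0, 21.0, 33.0],
        "disease": ["pregnant", "sickness", "pregnant", "fever"],
        "sent_to": ["Ward A", "Ward B", "Ward A", None],
    }
)

# Step 1: a boolean Series; a missing value compares as False.
mask = toy_df["sent_to"] == "Ward A"

# Step 2: keep only the rows where the mask is True.
ward_a = toy_df[mask]

# Summarize the subset with the methods from earlier.
ward_a_diseases = ward_a["disease"].value_counts()
ward_a_mean_age = ward_a["age"].mean()

# Combine two conditions; each needs its own parentheses.
young_ward_a = toy_df[(toy_df["sent_to"] == "Ward A") & (toy_df["age"] < 22)]

print(ward_a_diseases)
print(ward_a_mean_age)
print(len(young_ward_a))
```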
18.7 ❓ What data is missing? What data do you wish we had?
18.8 Overview
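Generate descriptive statistics for the columns in the data. By default, .describe() summarizes only numeric columns (here, age); passing include='all' covers every column.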
bellevue_df.describe()
| | age |
---|---|
count | 9548.000000 |
mean | 30.337039 |
std | 14.179527 |
min | 0.080000 |
25% | 21.000000 |
50% | 28.000000 |
75% | 39.000000 |
max | 97.000000 |
bellevue_df.describe(include='all')
| | date_in | first_name | last_name | full_name | age | gender | disease | profession | children | sent_to | sender1 | sender2 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
count | 9598 | 9594 | 9598 | 9598 | 9548.000000 | 9598 | 6509 | 8579 | 37 | 5666 | 9521 | 5212 |
unique | 653 | 523 | 3159 | 7308 | NaN | 5 | 75 | 172 | 36 | 77 | 59 | 80 |
top | 1847-05-24 | Mary | Kelly | Mary Smith | NaN | m | sickness | laborer | Child | Hospital | george w. anderson | peter c. johnston |
freq | 113 | 979 | 137 | 21 | NaN | 4967 | 2710 | 3116 | 2 | 3882 | 3469 | 1666 |
mean | NaN | NaN | NaN | NaN | 30.337039 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
std | NaN | NaN | NaN | NaN | 14.179527 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
min | NaN | NaN | NaN | NaN | 0.080000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
25% | NaN | NaN | NaN | NaN | 21.000000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
50% | NaN | NaN | NaN | NaN | 28.000000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
75% | NaN | NaN | NaN | NaN | 39.000000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
max | NaN | NaN | NaN | NaN | 97.000000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
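To see why the two tables above differ, here is a small sketch on made-up data (`toy_df` is hypothetical). The default call summarizes only the numeric column, while `include='all'` adds the text column and fills in `unique`, `top`, and `freq` for it; statistics that do not apply to a column show up as NaN.

```python
import pandas as pd

# Hypothetical toy data with one numeric and one text column.
toy_df = pd.DataFrame(
    {"age": [20.0, 30.0, 40.0], "gender": ["f", "m", "f"]}
)

# Default: numeric columns only.
numeric_summary = toy_df.describe()

# include='all': every column, with NaN where a statistic does not apply.
full_summary = toy_df.describe(include="all")

print(list(numeric_summary.columns))
print(list(full_summary.columns))
print(full_summary.loc["top", "gender"], full_summary.loc["freq", "gender"])
```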
Generate information about all the columns in the data
bellevue_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9598 entries, 0 to 9597
Data columns (total 12 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 date_in 9598 non-null object
1 first_name 9594 non-null object
2 last_name 9598 non-null object
3 full_name 9598 non-null object
4 age 9548 non-null float64
5 gender 9598 non-null object
6 disease 6509 non-null object
7 profession 8579 non-null object
8 children 37 non-null object
9 sent_to 5666 non-null object
10 sender1 9521 non-null object
11 sender2 5212 non-null object
dtypes: float64(1), object(11)
memory usage: 899.9+ KB
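The non-null counts above tell us indirectly how much data is missing. One way to count missing values directly is `.isna()` followed by `.sum()`, sketched here on made-up data (`toy_df` and its values are hypothetical). Taking `.mean()` of the same True/False table gives the share of missing values per column.

```python
import pandas as pd

# Hypothetical toy data with some missing values.
toy_df = pd.DataFrame(
    {
        "age": [28.0, None, 60.0],
        "children": [None, None, "Child 1 mo"],
        "gender": ["f", "m", "m"],
    }
)

# Number of missing values in each column.
missing = toy_df.isna().sum()

# Number of non-missing values, matching the "Non-Null Count" idea in .info().
non_null = toy_df.count()

# Share of missing values per column (True counts as 1, False as 0).
share_missing = toy_df.isna().mean()

print(missing)
print(non_null)
print(share_missing)
```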
Make a histogram of each numeric column in the DataFrame
bellevue_df.hist()
array([[<AxesSubplot:title={'center':'age'}>]], dtype=object)