Commit e1ff756

Initial commit
craigsdennis committed
0 parents, commit e1ff756

37 files changed: +249007 -0 lines changed

.gitignore

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+
+# Created by https://www.gitignore.io/api/jupyternotebook
+# Edit at https://www.gitignore.io/?templates=jupyternotebook
+
+### JupyterNotebook ###
+.ipynb_checkpoints
+*/.ipynb_checkpoints/*
+
+# Remove previous ipynb_checkpoints
+#   git rm -r .ipynb_checkpoints/
+#
+
+# End of https://www.gitignore.io/api/jupyternotebook

s1v4/BodyMeasures.csv

Lines changed: 9279 additions & 0 deletions
Large diffs are not rendered by default.

s1v4/Demographics.csv

Lines changed: 10587 additions & 0 deletions
Large diffs are not rendered by default.

s1v4/Occupation.csv

Lines changed: 7750 additions & 0 deletions
Large diffs are not rendered by default.

s1v4/Stage1-Video4.ipynb

Lines changed: 151 additions & 0 deletions
@@ -0,0 +1,151 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Import the pandas library and load in the data files\n",
+    "---"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import pandas as pd\n",
+    "\n",
+    "demo = pd.read_csv('Demographics.csv')\n",
+    "bmx = pd.read_csv('BodyMeasures.csv')\n",
+    "ocp = pd.read_csv('Occupation.csv')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Describe the numeric columns in the demographics DataFrame\n",
+    "---"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "demo.describe()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Display the first few rows of the demographics DataFrame\n",
+    "---"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "demo.head()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Select the first five columns by name\n",
+    "---"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "demo.loc[:,['SEQN','SDDSRVYR','RIDSTATR', 'RIDEXMON', 'RIAGENDR']].head()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Select the first five columns by numeric location\n",
+    "---"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "demo.iloc[0:4,0:5]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Merge the demographics and body measures DataFrames\n",
+    "---\n",
+    "\n",
+    "* Match the values in the SEQN columns between the DataFrames\n",
+    "* Do an inner join (keep only the data for participants listed in both files)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "dataset = pd.merge(demo, bmx, on='SEQN', how='inner')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Save the joined dataset to a new file\n",
+    "---"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# dataset.to_csv('MyDataset.csv', index=False)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.1"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
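
The notebook's selection and merge cells can be tried without the large CSV files. The sketch below is only an illustration under stated assumptions: the two small DataFrames are invented stand-ins for Demographics.csv and BodyMeasures.csv, and the `EXTRA` and `WEIGHT_KG` columns are hypothetical names that do not come from the commit. Only `SEQN`, `SDDSRVYR`, `RIDSTATR`, `RIDEXMON`, and `RIAGENDR` are taken from the notebook. It shows that `.iloc` slices exclude the end position, so `0:4` yields four rows while `.head()` yields five, and that an inner join on `SEQN` keeps only participants present in both frames.

```python
import pandas as pd

# Toy stand-ins for Demographics.csv and BodyMeasures.csv.
# All values are invented for illustration.
demo = pd.DataFrame({
    "SEQN": [1, 2, 3, 4, 5, 6],
    "SDDSRVYR": [8, 8, 8, 8, 8, 8],
    "RIDSTATR": [2, 2, 1, 2, 2, 2],
    "RIDEXMON": [1, 2, 1, 2, 1, 2],
    "RIAGENDR": [1, 2, 1, 2, 1, 2],
    "EXTRA": [0, 0, 0, 0, 0, 0],  # hypothetical sixth column
})
bmx = pd.DataFrame({
    "SEQN": [1, 2, 4, 5, 6, 7],  # 3 is absent; 7 has no demographics row
    "WEIGHT_KG": [70.0, 55.5, 81.2, 64.0, 90.1, 77.7],  # hypothetical column
})

first_five = ["SEQN", "SDDSRVYR", "RIDSTATR", "RIDEXMON", "RIAGENDR"]

# Label-based selection: every row, five named columns, then the first 5 rows.
by_name = demo.loc[:, first_five].head()

# Position-based selection: the slice end is exclusive, so 0:4 gives 4 rows.
by_pos_four_rows = demo.iloc[0:4, 0:5]

# Using 0:5 for the rows gives the same frame as the label-based version.
by_pos_five_rows = demo.iloc[0:5, 0:5]

# Inner join on SEQN: only participants listed in both frames survive.
dataset = pd.merge(demo, bmx, on="SEQN", how="inner")

print(by_pos_four_rows.shape)
print(by_pos_five_rows.equals(by_name))
print(dataset["SEQN"].tolist())
```

If the positional cell is meant to reproduce the `.head()` output of the label-based cell, a row slice of `0:5` would do so; the committed `0:4` returns one row fewer.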

s2v1/BodyMeasures.csv

Lines changed: 9283 additions & 0 deletions
Large diffs are not rendered by default.

s2v1/Demographics.csv

Lines changed: 10587 additions & 0 deletions
Large diffs are not rendered by default.

s2v1/Occupation.csv

Lines changed: 7750 additions & 0 deletions
Large diffs are not rendered by default.
