QA notes from first half of Dave thoughts

craigsdennis · craigsdennis · commit 09f8e6e5f70f · 2018-11-05T21:21:57.000-08:00
diff --git a/data/creation.py b/data/creation.py
@@ -99,8 +99,8 @@ def user_dict(self):
 
 user_row_dict['adrian'] =  {
     'first_name': 'Adrian',
-    'last_name': 'Yang',
-    'email': 'adrian.yang@teamtreehouse.com',
+    'last_name': 'Fang',
+    'email': 'adrian.fang@teamtreehouse.com',
     'email_verified': fake.email_verified(),
     'signup_date': fake.signup_date(),
     'referral_count': fake.random_int(0, 7),
diff --git a/data/users.csv b/data/users.csv
@@ -2,7 +2,7 @@
 aaron,Aaron,Davis,aaron6348@gmail.com,True,2018-08-31,6,18.14
 acook,Anthony,Cook,cook@gmail.com,True,2018-05-12,2,55.45
 adam.saunders,Adam,Saunders,adam@gmail.com,False,2018-05-29,3,72.12
-adrian,Adrian,Yang,adrian.yang@teamtreehouse.com,True,2018-04-28,3,30.01
+adrian,Adrian,Fang,adrian.fang@teamtreehouse.com,True,2018-04-28,3,30.01
 adrian.blair,Adrian,Blair,adrian9335@gmail.com,True,2018-06-16,7,25.85
 alan9443,Alan,Pope,pope@hotmail.com,True,2018-04-17,0,56.09
 alexander7808,Alexander,Moore,alexander.moore@gmail.com,False,2018-03-27,2,87.71
diff --git a/s1n01-creating-a-series.ipynb b/s1n01-creating-a-series.ipynb
@@ -28,7 +28,10 @@
    "source": [
     "## Creating from a dictionary\n",
     "\n",
-    "Let's use this sample data here. In our example, `test_balance_data` is just a standard Python dictionary the key is username, and the value is that user's current account balance. "
+    "\n",
+    "Let's use this sample data here we got from CashBox. They want to track the balances of their users. This is how much money each user currently has in their account.  CashBox requires that users create a username.\n",
+    "\n",
+    "In our example, `test_balance_data` is just a standard Python dictionary the key is username, and the value is that user's current account balance. "
    ]
   },
   {
@@ -164,7 +167,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Note, the order of the labels is guaranteed. "
+    "Note, the order of the labels is guaranteed to match the same order of the supplied index. "
    ]
   },
   {
@@ -195,7 +198,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "One thing to remember is that a NumPy array is also iterable. In fact, you'll find NumPy and Pandas get along really well together."
+    "One thing to remember is that a NumPy array is also iterable, so you can create a new `Series` from an `ndarray`. In fact, you'll find NumPy and Pandas get along very well together."
    ]
   },
   {
@@ -229,7 +232,7 @@
    "source": [
     "## Creating from a scalar and an index\n",
     "\n",
-    "If you pass in a scalar that value will be broadcasted to the keys specified in the index argument"
+    "If you pass in a scalar, remember that is a single value, it will be broadcast to each of the keys specified in the `index` keyword argument."
    ]
   },
   {
@@ -257,6 +260,13 @@
     "pd.Series(20.00, index=[\"guil\", \"jay\", \"james\", \"ben\", \"nick\"])"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In other words, each key is assigned the same scalar value for the entire `Series`."
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
diff --git a/s1n02-accessing-a-series.ipynb b/s1n02-accessing-a-series.ipynb
@@ -6,9 +6,9 @@
    "source": [
     "# Accessing a Series\n",
     "\n",
-    "There are multiple ways to get to the data that is stored in your `Series`. Let's explore the **`balances`** `Series`. \n",
+    "There are multiple ways to get to the data stored in your `Series`. Let's explore the **`balances`** `Series`. \n",
     "\n",
-    "Remember, the `Series` is indexed by username. The label is the username, the value is that user's balance."
+    "Remember, the `Series` is indexed by username. The label is the username, the value is that user's cash balance."
    ]
   },
   {
@@ -95,9 +95,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "The value is wrapped in a `NumPy.Scalar` so that it keeps it's data type and will play well with others.\n",
+    "The value is wrapped in a [`NumPy.Scalar`](https://docs.scipy.org/doc/numpy-1.15.0/reference/arrays.scalars.html) so that it keeps it's data type and will play well with other data types and NumPy data structures.\n",
     "\n",
-    "The same positional indexing works just as it does with a standard list."
+    "The same positional indexing works just as it does with a standard list. The indices begin start with 0, and negative numbers can be used to access values from the end of the list."
    ]
   },
   {
@@ -156,6 +156,65 @@
     "### `Series` behave like dictionaries"
    ]
   },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/markdown": [
+       "The label pasan has a value of 20.0"
+      ],
+      "text/plain": [
+       "<IPython.core.display.Markdown object>"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "data": {
+      "text/markdown": [
+       "The label treasure has a value of 20.18"
+      ],
+      "text/plain": [
+       "<IPython.core.display.Markdown object>"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "data": {
+      "text/markdown": [
+       "The label ashley has a value of 1.05"
+      ],
+      "text/plain": [
+       "<IPython.core.display.Markdown object>"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "data": {
+      "text/markdown": [
+       "The label craig has a value of 42.42"
+      ],
+      "text/plain": [
+       "<IPython.core.display.Markdown object>"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "for label, value in balances.items():\n",
+    "    render(\"The label {} has a value of {}\".format(label, value))"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 6,
@@ -259,9 +318,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Accessing More Explicitly\n",
+    "## Accessing More Explicitly with `loc` and `iloc`\n",
     "\n",
-    "We are using indexing which can *either* be a label *or* a positional index. This can get confusing. It's possible to be more explicit, [which yes wise Pythonista](https://www.python.org/dev/peps/pep-0020/), is always better than implicit.\n",
+    "So far we have used a label and a positional index to access the value. This can get confusing as to what is being used, a label or a position. Because of this ambiguity, it is possible to be more explicit, [which yes wise Pythonista](https://www.python.org/dev/peps/pep-0020/), is always better than implicit.\n",
     "\n",
     "A `Series` exposes a property named `loc` which can be used to explicitly lookup by label based indices only."
    ]
@@ -321,15 +380,15 @@
     "## Accessing by Slice\n",
     "Like a NumPy array, a `Series` also provides a way to use slices to get different portions of the data, returned as a `Series`.  \n",
     "\n",
-    "*NOTE*: Slicing with indices vs. labels behaves differently. The latter is inclusive."
+    "*WARNING*: Slicing with indices vs. labels behaves differently. The latter is inclusive."
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "### Slicing by Positional Index\n",
-    "When using positional indices, the slice is exclusive..."
+    "When using positional indices, the slice is exclusive. The last item **is not** included."
    ]
   },
   {
@@ -362,7 +421,7 @@
    "metadata": {},
    "source": [
     "### Slicing by Label\n",
-    "When using labels, the slice is inclusive..."
+    "When using labels, the slice is inclusive. The last item **is** included."
    ]
   },
   {
diff --git a/s1n03-vectorization-and-broadcasting.ipynb b/s1n03-vectorization-and-broadcasting.ipynb
@@ -6,7 +6,7 @@
    "source": [
     "# Series Vectorization and Broadcasting\n",
     "\n",
-    "Just like NumPy, pandas offers powerful vectorized methods and leans on broadcasting.\n",
+    "Just like NumPy, pandas offers powerful vectorized methods. It also leans on broadcasting.\n",
     "\n",
     "Let's explore!"
    ]
@@ -75,7 +75,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "...it's important to remember to lean on vectorization and skip the loops altogether."
+    "...it's important to remember to lean on vectorization and skip the loops altogether.  Vectorization is faster and as you can see, easier to read and write."
    ]
   },
   {
@@ -119,7 +119,7 @@
    "metadata": {},
    "source": [
     "### Broadcasting a Scalar\n",
-    "Also just like NumPy arrays, the mathematical operators have been overridden to use the vectorized versions of the same opration."
+    "Also just like NumPy arrays, the mathematical operators have been overridden to use the vectorized versions of the same operation."
    ]
   },
   {
@@ -227,7 +227,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### Using the `fill_value`\n",
+    "#### Using the `fill_value` parameter\n",
     "It is possible to fill missing values so that everything aligns. The concept is to use the `add` method directly along with the the keyword argument `fill_value`."
    ]
   },
diff --git a/s1n04-creating-a-dataframe.ipynb b/s1n04-creating-a-dataframe.ipynb
@@ -37,7 +37,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "If your data is already in rows and columns you can just pass it along to the constructor.  Labels and Column headings will be automatically generated as a range."
+    "If your data is already in rows and columns, like a list of lists, you can just pass it along to the constructor.  Labels and Column headings will be automatically generated as a range."
    ]
   },
   {
diff --git a/s1n05-accessing-a-dataframe.ipynb b/s1n05-accessing-a-dataframe.ipynb
@@ -5,7 +5,7 @@
    "metadata": {},
    "source": [
     "# Accessing a DataFrame\n",
-    "There are many [different choices for indexing](https://pandas.pydata.org/pandas-docs/stable/indexing.html#different-choices-for-indexing) DataFrames available.\n",
+    "There are many [different choices for indexing](https://pandas.pydata.org/pandas-docs/stable/indexing.html#different-choices-for-indexing) DataFrames.\n",
     "\n",
     "Let's explore!"
    ]
@@ -36,7 +36,7 @@
     "## Retrieve a specific Series\n",
     "\n",
     "### By Column Name\n",
-    "Each column is actually a `Series`. The `DataFrame` provides access to each of these `Series` by a column name index.\n",
+    "Each column in a `DataFrame` is actually a `Series`. The `DataFrame` provides access to each of these `Series` by a column name index.\n",
     "\n",
     "For instance, to get the **`balance`** `Series`, you could just use that for the index."
    ]
@@ -286,7 +286,9 @@
    "source": [
     "## Retrieve a Specific DataFrame Through Slicing\n",
     "\n",
-    "Using the `loc` and `iloc` properties you can slice an existing `DataFrame` into a new one."
+    "Using the `loc` and `iloc` properties you can slice an existing `DataFrame` into a new one.\n",
+    "\n",
+    "In the example below we use `:` in the rows axis to select all rows, and we specify which columns we want back using a list in the columns axis, ala NumPy Fancy Indexing."
    ]
   },
   {
diff --git a/s2n01-exploration-methods.ipynb b/s2n01-exploration-methods.ipynb
diff --git a/s2n02-selecting-data.ipynb b/s2n02-selecting-data.ipynb
diff --git a/s2n04-manipulation-techniques.ipynb b/s2n04-manipulation-techniques.ipynb
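
Reviewer note: the revised notebook text above describes three behaviors that are easy to check in one place: a scalar broadcast to every label at creation, positional slices that exclude the end versus label slices that include it, and `add` with `fill_value`. The following is a minimal sketch, not a cell from the notebooks. The `balances` values are the four shown in the `s1n02` cell output; the `bonus` series and the variable names are assumptions made up for illustration.

```python
import pandas as pd

# Labels are usernames, values are balances (taken from the s1n02 output above).
balances = pd.Series(
    [20.00, 20.18, 1.05, 42.42],
    index=["pasan", "treasure", "ashley", "craig"],
)

# A scalar is broadcast to every label in the supplied index.
starting = pd.Series(20.00, index=["guil", "jay", "james", "ben", "nick"])

# Positional slice: the end position is NOT included.
by_position = balances.iloc[0:2]

# Label slice: the end label IS included.
by_label = balances.loc["pasan":"ashley"]

# Hypothetical second Series that only has a value for one user.
bonus = pd.Series([5.00], index=["craig"])

# With fill_value=0, labels missing from one side are treated as 0
# instead of producing NaN in the result.
total = balances.add(bonus, fill_value=0)

print(by_position)
print(by_label)
print(total)
```

Without `fill_value`, `balances + bonus` would yield `NaN` for every label that is missing from `bonus`.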
