{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Queries reprised: asking your data questions\n",
    "\n",
    "Much of this week has focused on data preparation: combining datasets together, reconfiguring and aggregating them into groups, all for the purpose of preparing the data to help us develop insights and ask questions. In this lesson, we'll be returning to how we can programmatically ask questions of our datasets using queries. We have seen all of this before when we discussed subsetting our data and filtering based on different logical expressions. Here we will introduce a compact way to execute such queries to add another tool into your programmatic toolbox for working with data. A few of you may have encountered structured query language (SQL) before, and if so, there will be many connections to what we discuss today.\n",
    "\n",
    "Let's start with our sales data from a previous lesson:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>employee</th>\n",
       "      <th>sales</th>\n",
       "      <th>year</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Katrina</td>\n",
       "      <td>14</td>\n",
       "      <td>2018</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Guanyu</td>\n",
       "      <td>17</td>\n",
       "      <td>2019</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Jan</td>\n",
       "      <td>6</td>\n",
       "      <td>2020</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Roman</td>\n",
       "      <td>12</td>\n",
       "      <td>2018</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Jacqueline</td>\n",
       "      <td>8</td>\n",
       "      <td>2020</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Paola</td>\n",
       "      <td>3</td>\n",
       "      <td>2019</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Esperanza</td>\n",
       "      <td>7</td>\n",
       "      <td>2019</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Alaina</td>\n",
       "      <td>15</td>\n",
       "      <td>2020</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Egweyn</td>\n",
       "      <td>5</td>\n",
       "      <td>2020</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     employee  sales  year\n",
       "0     Katrina     14  2018\n",
       "1      Guanyu     17  2019\n",
       "2         Jan      6  2020\n",
       "3       Roman     12  2018\n",
       "4  Jacqueline      8  2020\n",
       "5       Paola      3  2019\n",
       "6   Esperanza      7  2019\n",
       "7      Alaina     15  2020\n",
       "8      Egweyn      5  2020"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "\n",
    "pd.set_option(\"mode.copy_on_write\", True)\n",
    "\n",
    "sales = pd.DataFrame(\n",
    "    data={\n",
    "        \"employee\": [\n",
    "            \"Katrina\",\n",
    "            \"Guanyu\",\n",
    "            \"Jan\",\n",
    "            \"Roman\",\n",
    "            \"Jacqueline\",\n",
    "            \"Paola\",\n",
    "            \"Esperanza\",\n",
    "            \"Alaina\",\n",
    "            \"Egweyn\",\n",
    "        ],\n",
    "        \"sales\": [14, 17, 6, 12, 8, 3, 7, 15, 5],\n",
    "        \"year\": [2018, 2019, 2020, 2018, 2020, 2019, 2019, 2020, 2020],\n",
    "    }\n",
    ")\n",
    "sales"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To query a `pandas` DataFrame, we need to use the `query` method which takes a query string. This string allows you to ask a world of questions of your data for example, see the table below for a set of examples of quieries and their corresponding query strings for our `sales` dataset:\n",
    "\n",
    "| Query      | Query string |\n",
    "| ----------- | ----------- |\n",
    "| Show me sales greater than 10 | \"sales > 10\" |\n",
    "| Show me data from 2018 | \"year == 2018\" |\n",
    "| Show me sales are greater than 13 and the year is 2018| \"sales > 13 and year == 2018\" |\n",
    "| Show me everything EXCEPT for when sales are greater than 13 and the year is 2018 | \"not (sales > 13 and year == 2018)\" |\n",
    "| Show me data where sales divided by 3 are greater than 3 | \"sales/3 > 3\" |\n",
    "| Show me employees whose names are alphabetically after J | \"employee > 'J'\" |\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's show each of these queries in action below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>employee</th>\n",
       "      <th>sales</th>\n",
       "      <th>year</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Katrina</td>\n",
       "      <td>14</td>\n",
       "      <td>2018</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Guanyu</td>\n",
       "      <td>17</td>\n",
       "      <td>2019</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Roman</td>\n",
       "      <td>12</td>\n",
       "      <td>2018</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Alaina</td>\n",
       "      <td>15</td>\n",
       "      <td>2020</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  employee  sales  year\n",
       "0  Katrina     14  2018\n",
       "1   Guanyu     17  2019\n",
       "3    Roman     12  2018\n",
       "7   Alaina     15  2020"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sales.query(\"sales > 10\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>employee</th>\n",
       "      <th>sales</th>\n",
       "      <th>year</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Katrina</td>\n",
       "      <td>14</td>\n",
       "      <td>2018</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Roman</td>\n",
       "      <td>12</td>\n",
       "      <td>2018</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  employee  sales  year\n",
       "0  Katrina     14  2018\n",
       "3    Roman     12  2018"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sales.query(\"year == 2018\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>employee</th>\n",
       "      <th>sales</th>\n",
       "      <th>year</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Katrina</td>\n",
       "      <td>14</td>\n",
       "      <td>2018</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  employee  sales  year\n",
       "0  Katrina     14  2018"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sales.query(\"sales > 13 and year == 2018\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>employee</th>\n",
       "      <th>sales</th>\n",
       "      <th>year</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Guanyu</td>\n",
       "      <td>17</td>\n",
       "      <td>2019</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Jan</td>\n",
       "      <td>6</td>\n",
       "      <td>2020</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Roman</td>\n",
       "      <td>12</td>\n",
       "      <td>2018</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Jacqueline</td>\n",
       "      <td>8</td>\n",
       "      <td>2020</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Paola</td>\n",
       "      <td>3</td>\n",
       "      <td>2019</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Esperanza</td>\n",
       "      <td>7</td>\n",
       "      <td>2019</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Alaina</td>\n",
       "      <td>15</td>\n",
       "      <td>2020</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Egweyn</td>\n",
       "      <td>5</td>\n",
       "      <td>2020</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     employee  sales  year\n",
       "1      Guanyu     17  2019\n",
       "2         Jan      6  2020\n",
       "3       Roman     12  2018\n",
       "4  Jacqueline      8  2020\n",
       "5       Paola      3  2019\n",
       "6   Esperanza      7  2019\n",
       "7      Alaina     15  2020\n",
       "8      Egweyn      5  2020"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sales.query(\"not (sales > 13 and year == 2018)\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>employee</th>\n",
       "      <th>sales</th>\n",
       "      <th>year</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Katrina</td>\n",
       "      <td>14</td>\n",
       "      <td>2018</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Guanyu</td>\n",
       "      <td>17</td>\n",
       "      <td>2019</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Roman</td>\n",
       "      <td>12</td>\n",
       "      <td>2018</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Alaina</td>\n",
       "      <td>15</td>\n",
       "      <td>2020</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  employee  sales  year\n",
       "0  Katrina     14  2018\n",
       "1   Guanyu     17  2019\n",
       "3    Roman     12  2018\n",
       "7   Alaina     15  2020"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sales.query(\"sales/3 > 3\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>employee</th>\n",
       "      <th>sales</th>\n",
       "      <th>year</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Katrina</td>\n",
       "      <td>14</td>\n",
       "      <td>2018</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Jan</td>\n",
       "      <td>6</td>\n",
       "      <td>2020</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Roman</td>\n",
       "      <td>12</td>\n",
       "      <td>2018</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Jacqueline</td>\n",
       "      <td>8</td>\n",
       "      <td>2020</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Paola</td>\n",
       "      <td>3</td>\n",
       "      <td>2019</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     employee  sales  year\n",
       "0     Katrina     14  2018\n",
       "2         Jan      6  2020\n",
       "3       Roman     12  2018\n",
       "4  Jacqueline      8  2020\n",
       "5       Paola      3  2019"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sales.query('employee > \"J\"')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, these same queries could have been accomplished using selection and filtering concept we discussed earlier, as shown below, but requiring more sytax (and the `query` method is often more computationally efficient). Let's try rewriting `sales.query('sales > 13 and year == 2018')` using the techniques we described earlier:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>employee</th>\n",
       "      <th>sales</th>\n",
       "      <th>year</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Katrina</td>\n",
       "      <td>14</td>\n",
       "      <td>2018</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  employee  sales  year\n",
       "0  Katrina     14  2018"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sales[(sales[\"sales\"] > 13) & (sales[\"year\"] == 2018)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `query` method makes our code far easier to follow.\n",
    "\n",
    "Another nice feature of `query` is that we can bring in variables from our local workspace, but outside of the DataFrame by prefixing the `@` symbol to the variable name. Let's say we knew that each sale was worth $1,000. Then we could query based on the total dollar amount of sales as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>employee</th>\n",
       "      <th>sales</th>\n",
       "      <th>year</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Katrina</td>\n",
       "      <td>14</td>\n",
       "      <td>2018</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Guanyu</td>\n",
       "      <td>17</td>\n",
       "      <td>2019</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Roman</td>\n",
       "      <td>12</td>\n",
       "      <td>2018</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Alaina</td>\n",
       "      <td>15</td>\n",
       "      <td>2020</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  employee  sales  year\n",
       "0  Katrina     14  2018\n",
       "1   Guanyu     17  2019\n",
       "3    Roman     12  2018\n",
       "7   Alaina     15  2020"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "revenue_per_sale = 1000\n",
    "sales.query(\"sales * @revenue_per_sale > 10000\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Computation with columns\n",
    "\n",
    "In a similar way that we were able to create more readable and efficient queries with the `query` method, `pandas` also has a faster, more readable method for performing a number of computations on columns. let's create a DataFrame with some data to demonstrate:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>speed_initial</th>\n",
       "      <th>speed_final</th>\n",
       "      <th>time_initial</th>\n",
       "      <th>time_final</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.271065</td>\n",
       "      <td>0.720802</td>\n",
       "      <td>0.796269</td>\n",
       "      <td>0.433519</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.312085</td>\n",
       "      <td>0.349255</td>\n",
       "      <td>0.258158</td>\n",
       "      <td>0.833784</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.986399</td>\n",
       "      <td>0.129630</td>\n",
       "      <td>0.059127</td>\n",
       "      <td>0.246308</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.542444</td>\n",
       "      <td>0.295594</td>\n",
       "      <td>0.250011</td>\n",
       "      <td>0.102992</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.226410</td>\n",
       "      <td>0.670819</td>\n",
       "      <td>0.388360</td>\n",
       "      <td>0.600059</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   speed_initial  speed_final  time_initial  time_final\n",
       "0       0.271065     0.720802      0.796269    0.433519\n",
       "1       0.312085     0.349255      0.258158    0.833784\n",
       "2       0.986399     0.129630      0.059127    0.246308\n",
       "3       0.542444     0.295594      0.250011    0.102992\n",
       "4       0.226410     0.670819      0.388360    0.600059"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "rand_matrix = np.random.rand(5, 4)\n",
    "data = pd.DataFrame(\n",
    "    rand_matrix, columns=[\"speed_initial\", \"speed_final\", \"time_initial\", \"time_final\"]\n",
    ")\n",
    "data_copy = data.copy()  # Make a copy to use later\n",
    "data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In physics, we can calculate acceleration of an object by it's change in velocity divided by the time it took to change that velocity. Let's calculate that using the tools we've discussed alread:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>speed_initial</th>\n",
       "      <th>speed_final</th>\n",
       "      <th>time_initial</th>\n",
       "      <th>time_final</th>\n",
       "      <th>acceleration</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.271065</td>\n",
       "      <td>0.720802</td>\n",
       "      <td>0.796269</td>\n",
       "      <td>0.433519</td>\n",
       "      <td>-1.239797</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.312085</td>\n",
       "      <td>0.349255</td>\n",
       "      <td>0.258158</td>\n",
       "      <td>0.833784</td>\n",
       "      <td>0.064573</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.986399</td>\n",
       "      <td>0.129630</td>\n",
       "      <td>0.059127</td>\n",
       "      <td>0.246308</td>\n",
       "      <td>-4.577205</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.542444</td>\n",
       "      <td>0.295594</td>\n",
       "      <td>0.250011</td>\n",
       "      <td>0.102992</td>\n",
       "      <td>1.679045</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.226410</td>\n",
       "      <td>0.670819</td>\n",
       "      <td>0.388360</td>\n",
       "      <td>0.600059</td>\n",
       "      <td>2.099245</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   speed_initial  speed_final  time_initial  time_final  acceleration\n",
       "0       0.271065     0.720802      0.796269    0.433519     -1.239797\n",
       "1       0.312085     0.349255      0.258158    0.833784      0.064573\n",
       "2       0.986399     0.129630      0.059127    0.246308     -4.577205\n",
       "3       0.542444     0.295594      0.250011    0.102992      1.679045\n",
       "4       0.226410     0.670819      0.388360    0.600059      2.099245"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data[\"acceleration\"] = (data[\"speed_final\"] - data[\"speed_initial\"]) / (\n",
    "    data[\"time_final\"] - data[\"time_initial\"]\n",
    ")\n",
    "data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Instead we can use the `eval` function, which has a syntax similar to that of `query`, but lets us do a number of common computations more clearly (and more quickly, since the code will run more quickly as well):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>speed_initial</th>\n",
       "      <th>speed_final</th>\n",
       "      <th>time_initial</th>\n",
       "      <th>time_final</th>\n",
       "      <th>acceleration</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.271065</td>\n",
       "      <td>0.720802</td>\n",
       "      <td>0.796269</td>\n",
       "      <td>0.433519</td>\n",
       "      <td>-1.239797</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.312085</td>\n",
       "      <td>0.349255</td>\n",
       "      <td>0.258158</td>\n",
       "      <td>0.833784</td>\n",
       "      <td>0.064573</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.986399</td>\n",
       "      <td>0.129630</td>\n",
       "      <td>0.059127</td>\n",
       "      <td>0.246308</td>\n",
       "      <td>-4.577205</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.542444</td>\n",
       "      <td>0.295594</td>\n",
       "      <td>0.250011</td>\n",
       "      <td>0.102992</td>\n",
       "      <td>1.679045</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.226410</td>\n",
       "      <td>0.670819</td>\n",
       "      <td>0.388360</td>\n",
       "      <td>0.600059</td>\n",
       "      <td>2.099245</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   speed_initial  speed_final  time_initial  time_final  acceleration\n",
       "0       0.271065     0.720802      0.796269    0.433519     -1.239797\n",
       "1       0.312085     0.349255      0.258158    0.833784      0.064573\n",
       "2       0.986399     0.129630      0.059127    0.246308     -4.577205\n",
       "3       0.542444     0.295594      0.250011    0.102992      1.679045\n",
       "4       0.226410     0.670819      0.388360    0.600059      2.099245"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_copy.eval(\n",
    "    \"acceleration = (speed_final - speed_initial) / (time_final - time_initial)\",\n",
    "    inplace=True,\n",
    ")\n",
    "data_copy"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Similarly, we could prefix a local variable name with the `@` symbol to incorporate the variable into the `eval` expression. While not appropriate for cases which use complex functions, `eval` is a helpful tool for simple expressions."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Summary\n",
    "\n",
    "While none of the ends that were achieved by the tools in this section were unable to be reached using the tools from earlier parts of this course, the `query` and `eval` tools discussed here help to make that process easier to understand, easier to implement, and more efficient for processing."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.8.2 ('base')",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.2"
  },
  "orig_nbformat": 4,
  "vscode": {
   "interpreter": {
    "hash": "c97f18ea1c0f4969cf594a5df9f14ba2a838cf106fc5300ddecce29d9d6f0c71"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}