Download and Install
<ol>
<li style="padding-bottom: 10px;">First install Docker onto your PC. Installation instructions can be found at the following locations:
<ul>
<li><a style="color: blue;" href="https://store.docker.com/editions/community/docker-ce-desktop-windows">Windows 10 Professional or Enterprise 64-bit</a></li>
<li><a style="color: blue;" href="https://store.docker.com/editions/community/docker-ce-desktop-mac">Mac OS Yosemite 10.10.3 or above</a></li>
<li><a style="color: blue;" href="https://docs.docker.com/toolbox/overview/">Older Mac OS or Windows</a></li>
<li><a style="color: blue;" href="https://docs.docker.com/install/">Linux (CentOS, Debian, Ubuntu, Fedora)</a></li>
</ul>
</li>
</ol>
<p style="padding-left: 30px;">(Note: On some Windows-based machines, it may be necessary to enable hardware virtualization in the BIOS. See <a style="color: blue;" href="https://docs.docker.com/docker-for-windows/troubleshoot/#virtualization">here</a> for more details.)</p>
<ol start="2">
<li style="padding-bottom: 10px;">Check your Docker settings to ensure that enough memory is allocated to run PivotBillions (minimum 2 GB).
<img class="size-large wp-image-793 aligncenter" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/docker-advanced.png" alt="" width="640" height="414" /></li>
<li style="padding-bottom: 10px;">Open a shell (on Windows 10, use PowerShell or cmd.exe).</li>
<li style="padding-bottom: 10px;">Pull the PivotBillions container from Docker:
<pre style="background-color: lightgray;">>docker pull auriqsystems/pivotbillions</pre>
<em>For Docker on Windows, make sure to switch to <u>Linux Containers</u> in your settings; otherwise you will be unable to pull PivotBillions.</em></li>
<li style="padding-bottom: 10px;">Run PivotBillions
<pre style="background-color: lightgray;">>docker run -dit -p 80:3000 --name="pb" auriqsystems/pivotbillions /bin/bash
>docker exec -d pb bash /home/start-server.sh</pre>
</li>
<li style="padding-bottom: 10px;">Open a browser window and enter: <strong>http://localhost/index.html</strong></li>
<li style="padding-bottom: 10px;">You should see the PivotBillions UI in your browser
<img class="size-large wp-image-588 aligncenter" style="border: 1px solid;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-docker01.png" alt="" width="640" height="288" /></li>
</ol>
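<p>For readers who prefer to script the pull, run and browser steps, the commands above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not part of PivotBillions: the function names and the optional host_port parameter are hypothetical, Docker is assumed to be installed and running, and it is assumed that publishing a different host port with -p only changes the port in the browser URL.</p>

```python
import subprocess
import time
import urllib.error
import urllib.request

IMAGE = "auriqsystems/pivotbillions"  # image name from the steps above


def docker_commands(host_port=80, name="pb"):
    """Return the pull, run and exec commands from the steps above as argument lists."""
    return [
        ["docker", "pull", IMAGE],
        ["docker", "run", "-dit", "-p", f"{host_port}:3000",
         f"--name={name}", IMAGE, "/bin/bash"],
        ["docker", "exec", "-d", name, "bash", "/home/start-server.sh"],
    ]


def ui_url(host_port=80):
    """Build the browser URL; port 80 is the HTTP default and can be omitted."""
    host = "localhost" if host_port == 80 else f"localhost:{host_port}"
    return f"http://{host}/index.html"


def wait_for_ui(url, attempts=30, delay=2.0):
    """Poll the URL until it answers with HTTP 200 or the attempts run out."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # the server is not answering yet
        time.sleep(delay)
    return False


def start_pivotbillions(host_port=80):
    """Run each command in order (hypothetical helper), then wait for the UI."""
    for command in docker_commands(host_port):
        subprocess.run(command, check=True)  # raises if a command fails
    return wait_for_ui(ui_url(host_port))
```

<p>Under these assumptions, calling start_pivotbillions(8080) would publish the UI at http://localhost:8080/index.html, which may help when port 80 is already in use on your machine.</p>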
Importing Data
<ol>
<li style="padding-bottom: 10px;">Click on the plus symbol located in the upper right corner of the data selection box to show the data import options.<br />
<img class="size-large wp-image-592 aligncenter" style="padding: 5px;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-docker05.png" alt="" width="640" height="188" /></li>
<li style="padding-bottom: 10px;">The Drag &amp; drop option lets you add files stored locally on your PC. The From URL option lets you specify files accessible online by entering one or more URLs into an input box.</li>
<li style="padding-bottom: 10px;">After you've added your new data files either from local storage or from an online source, you will see the new data files in the data selection box.<br />
<img class="size-large wp-image-595 aligncenter" style="padding: 5px;" src="https://pivotbillions.co.jp/wp-content/uploads/2019/06/pb-docker08.png" alt="" width="640" height="189" /></li>
</ol>
Load Data to Reports
<ol>
<li style="padding-bottom: 10px;">From the Pivot Billions data selection box, select the Main data file you want to load from the selection list and then click on the Preview button.<br />
<img class="aligncenter wp-image-589 size-large" style="padding-bottom: 5px;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-docker02.png" alt="" width="640" height="354" /></li>
<li style="padding-bottom: 10px;">A preview of the columns will load beneath. You can change the column labels or data type as well as select your data keys here.</li>
<li style="padding-bottom: 10px;">By default the Skip Errors option is selected. This instructs Pivot Billions to skip rows with errors in them. Click on the Import button to load the data.</li>
<li style="padding-bottom: 10px;">You should now see the report table with the selected sample data loaded.<br />
<img class="size-large wp-image-590 aligncenter" style="padding-bottom: 5px;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-docker03.png" alt="" width="640" height="353" /><br />
Once the sample data has been loaded, you can begin interacting with and analyzing the data in the Report UI.</li>
</ol>
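<p>The idea behind the Skip Errors option can be illustrated with a short Python sketch. The exact rules Pivot Billions uses to decide that a row contains an error are not described here, so this sketch assumes a row is in error when it has the wrong number of fields or a value cannot be converted to the column's data type. The load_rows function is hypothetical.</p>

```python
def load_rows(rows, types, skip_errors=True):
    """Convert each row to the given column types.

    Assumption: a row is "in error" if its field count differs from the
    number of columns or if a value cannot be converted to its column type.
    """
    loaded, skipped = [], 0
    for row in rows:
        try:
            if len(row) != len(types):
                raise ValueError("wrong number of fields")
            loaded.append([convert(value) for convert, value in zip(types, row)])
        except ValueError:
            if not skip_errors:
                raise  # without Skip Errors, the first bad row stops the load
            skipped += 1
    return loaded, skipped


# Made-up sample rows: the second has a non-numeric id, the third is too short.
sample = [["1", "2.5"], ["x", "3.0"], ["2"]]
loaded, skipped = load_rows(sample, [int, float])
```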
Sorts, Filters and Distributions
<p><strong>Sorting</strong></p>
<ol>
<li style="padding-bottom: 10px;">Hover over a column name in the Report table and click on the Sort icon <img class="alignnone size-full wp-image-872" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/tools_sort.png" alt="" width="108" height="51" /> in the tools overlay.</li>
<li style="padding-bottom: 10px;">The selected column will initially sort in descending order. Click the sort icon again to sort in ascending order. The sort direction will be displayed next to the column name as shown below.<br />
<img class="wp-image-873 size-full aligncenter" style="padding-bottom: 5px;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/sort.png" alt="" width="416" height="226" /></li>
</ol>
<p><strong>Filtering (Global)</strong></p>
<p>Applying global filters affects not only the main report table but also any generated distribution graphs or pivot tables.</p>
<ol>
<li style="padding-bottom: 10px;">Hover over a column name in the Report table and click on the Filter icon <img class="alignnone size-full wp-image-875" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/tools_filt.png" alt="" width="108" height="51" /> in the tools overlay.</li>
<li style="padding-bottom: 10px;">The Filter Condition box will open, allowing you to enter the criteria for your filter. After entering your filter criteria, press Enter on your keyboard to apply the filter.<br />
<img class="wp-image-876 aligncenter" style="padding-bottom: 5px;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/filter01.png" alt="" width="450" height="230" /><br />
You can apply multiple filters across different data columns and within the same column (e.g. "Greater than 1" and "Less than 10").</li>
<li>Filtered columns will display the filter symbol next to the column label.<br />
<img class="aligncenter wp-image-997 size-full" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/filter02-1.png" alt="" width="458" height="216" /></li>
<li>To change the logical operation of the filters (AND, OR), click on the Filter icon located in the header of the UI, then click on the "and" text between the filter conditions. This will toggle the logic from AND to OR.<br />
<img class="wp-image-998 size-full aligncenter" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/filter03.png" alt="" width="851" height="173" /><br />
<img class="wp-image-999 size-full aligncenter" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/filter04.png" alt="" width="849" height="169" /></li>
<li>To delete or remove a filter, click on the <strong>(x)</strong> next to the filter condition to remove a single filter, or click on the trash icon to remove all filters.<br />
<img class="wp-image-1006 size-full aligncenter" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/filter05.png" alt="" width="849" height="169" /></li>
</ol>
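<p>The effect of toggling the filter logic between AND and OR can be sketched in Python. This only illustrates the logic and is not how Pivot Billions implements filtering; the apply_filters function and the sample rows are hypothetical.</p>

```python
import operator

# Comparison names taken from the example conditions above; other operators
# offered by the Filter Condition box are not modeled here.
COMPARISONS = {"Greater than": operator.gt, "Less than": operator.lt}


def apply_filters(rows, conditions, logic="and"):
    """Keep rows that satisfy all conditions ("and") or at least one ("or")."""
    combine = all if logic == "and" else any
    return [
        row for row in rows
        if combine(COMPARISONS[name](row[column], value)
                   for column, name, value in conditions)
    ]


# Made-up sample rows for illustration.
rows = [{"trip_distance": 0.5}, {"trip_distance": 5.0}, {"trip_distance": 12.0}]
conditions = [
    ("trip_distance", "Greater than", 1),
    ("trip_distance", "Less than", 10),
]

both = apply_filters(rows, conditions, logic="and")   # only the 5.0 row
either = apply_filters(rows, conditions, logic="or")  # all three rows
```

<p>With AND, only values between 1 and 10 remain. With OR, every row in this sample passes, because each value is either greater than 1 or less than 10.</p>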
<p><strong>Filtering (Local)</strong></p>
<p>Applying local filters affects only the distribution graph or pivot table that the filter is entered from.</p>
<ol>
<li>Within a distribution graph or pivot table, click on the plus symbol (+), usually found at the top of the table, chart or graph.<br />
<img class="aligncenter wp-image-1008 size-full" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/filter06.png" alt="" width="432" height="210" /></li>
<li>Use the dropdowns and fields that appear to enter your filter conditions.<img class="aligncenter wp-image-1009 size-full" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/filter07.png" alt="" width="484" height="324" /></li>
<li>Just like global filters, you can enter multiple local filters at the same time. In addition, you can click on the "and" text to change the logic to an "or" condition.</li>
<li>To remove any local filters, simply click on the (x) next to the filter condition.</li>
</ol>
<p><strong>Distribution Graphs</strong></p>
<ol>
<li style="padding-bottom: 10px;">Hover over a column name in the Report table and click on the Graph icon <img class="alignnone wp-image-930 size-full" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/tools_dist.png" alt="" width="88" height="37" /> in the tools overlay.</li>
<li style="padding-bottom: 10px;">Directly beneath the Report table, a Distribution graph will be generated based on the column data selected. If the data is numeric, then the graph will show distribution of counts by range. If the data is string, then the counts for each unique value will be shown. <img class="size-large wp-image-931 aligncenter" style="padding-bottom: 5px;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-dist.png" alt="" width="640" height="280" /></li>
<li style="padding-bottom: 10px;">Clicking on the Table icon <img class="alignnone size-full wp-image-939" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/tools_dist2.png" alt="" width="121" height="45" /> will show the distribution data represented in table format.</li>
<li style="padding-bottom: 10px;">For string data, clicking on the Pie Chart icon <img class="alignnone size-full wp-image-940" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/tools_dist3.png" alt="" width="121" height="45" /> will show the distribution data as a pie chart.</li>
</ol>
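<p>The two kinds of distribution described above (counts by range for numeric data, counts per unique value for string data) can be sketched in Python. How Pivot Billions chooses its ranges is not stated here, so the equal-width bins and the bins parameter below are assumptions, and the distribution function is hypothetical.</p>

```python
from collections import Counter


def distribution(values, bins=5):
    """Count values per range (numeric data) or per unique value (string data).

    Assumption: numeric ranges are equal-width bins between the minimum and
    maximum value; the actual binning used by the tool may differ.
    """
    if not values:
        return {}
    if all(isinstance(v, (int, float)) for v in values):
        low, high = min(values), max(values)
        width = (high - low) / bins or 1  # avoid a zero width when all values match
        # The maximum value is placed in the last bin rather than a new one.
        counts = Counter(min(int((v - low) / width), bins - 1) for v in values)
        return {
            (low + i * width, low + (i + 1) * width): counts.get(i, 0)
            for i in range(bins)
        }
    return dict(Counter(values))


# Made-up sample values for illustration.
by_value = distribution(["Green", "Yellow", "Green"])
by_range = distribution([0, 1, 2, 3, 4, 10], bins=2)
```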
Combining Multiple Files
<p>Often, the data you want to analyze is split over multiple files. To analyze all the data as one cohesive data set, you must first combine the files. In Pivot Billions, you can combine data files that have the same schema or data structure by following these steps.</p>
<ol>
<li style="padding-bottom: 10px;">From the data selection box, click on the check boxes for each file you want to combine under the Main heading. In this case we are using the two New York taxi data files already included with Pivot Billions.<br />
<img class="alignnone wp-image-598" style="border: 1px solid;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-docker11.png" alt="" width="800" height="252" /></li>
<li style="padding-bottom: 10px;">Select the Preview button to view the column labels and data types and click Import when you are ready.</li>
<li style="padding-bottom: 10px;">The combined data files will be loaded into the report table. Notice that the combined row count is twice that of a single data file.<br />
<img class="alignnone wp-image-599" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-docker12.png" alt="" width="800" height="453" /></li>
<li style="padding-bottom: 10px;">You can combine as many files as necessary, but keep in mind that capacity and performance depend on your Docker system resource settings. For larger data sets, you may need to allocate more memory and/or CPU to process them efficiently.</li>
</ol>
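<p>Conceptually, combining files with the same schema amounts to stacking their rows under a single header. The Python sketch below illustrates this with the standard csv module; it is not the Pivot Billions implementation, and the combine_csv function and sample data are hypothetical.</p>

```python
import csv
import io


def combine_csv(files):
    """Stack the rows of several CSV files that share the same header."""
    header, rows = None, []
    for handle in files:
        reader = csv.reader(handle)
        file_header = next(reader)  # the first line of each file is its header
        if header is None:
            header = file_header
        elif file_header != header:
            raise ValueError("files do not share the same schema")
        rows.extend(reader)  # remaining lines are data rows
    return header, rows


# Made-up in-memory files standing in for two data files with the same columns.
first = io.StringIO("id,distance\n1,2.5\n2,3.0\n")
second = io.StringIO("id,distance\n3,1.0\n4,4.2\n")
header, rows = combine_csv([first, second])
```

<p>Here the combined row count (4) is the sum of the row counts of the two inputs (2 each), and a file with different columns is rejected rather than merged.</p>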
Joining Files
<p>When you have different types of data files that are connected through common keys, Pivot Billions allows you to join these files using a left join. This is very useful if you want to integrate a lookup table into your primary data set. In the following example, we will combine the two sample New York taxi data files and then join the taxi zone lookup table file located <a style="color: blue;" href="https://s3.amazonaws.com/nyc-tlc/misc/taxi+_zone_lookup.csv" target="_blank" rel="noopener">here</a> to the combined data set.</p>
<p>To Join data, follow these steps:</p>
<ol>
<li>Import the taxi zone lookup file by entering the URL in the data selection box. Click on the Go button to import the file.<br />
<img class="alignnone wp-image-602" style="border: 1px solid;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-docker13.png" alt="" width="800" height="326" /></li>
<li style="padding-bottom: 10px;">Once the lookup file has loaded, click on the check boxes under the Main heading for the taxi data as shown, and then click on the check box under the Join heading for the taxi zone lookup file you just loaded.<br />
<img class="alignnone wp-image-603" style="border: 1px solid;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-docker14.png" alt="" width="800" height="232" /></li>
<li style="padding-bottom: 10px;">Select the Preview button to view the column labels and data types. Notice that there is a warning message above the schema preview that indicates there is no matching key column between the Main data set and the Join file. In the Main data set, there is a column labeled PULocationID while in the Join data set there is a column labeled LocationID.<br />
<img class="alignnone wp-image-604" style="border: 1px solid;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-docker15.png" alt="" width="800" height="466" /></li>
<li style="padding-bottom: 10px;">Change the LocationID label in the Join data set to PULocationID as shown. Notice that the warning message disappears after a matching Key column has been identified.<br />
<img class="alignnone wp-image-605" style="border: 1px solid;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-docker16.png" alt="" width="800" height="447" /></li>
<li style="padding-bottom: 10px;">Click on the Import button to load both the Main data set and the Join data set. Once the data has loaded into the report table, scroll right until you see the newly joined data columns of Borough, Zone and service_zone.<br />
<img class="alignnone wp-image-606" style="border: 1px solid;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-docker17.png" alt="" width="800" height="432" /></li>
<li>At this point you can now select these columns as dimensions for analysis in pivot tables, as well as perform all other report functions. In this example we demonstrated joining one lookup table file, but you can Join multiple files as necessary following the same steps.</li>
</ol>
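<p>The left join performed in the steps above, including the renaming of LocationID to PULocationID, can be sketched in Python. This is an illustration of left join semantics only: the left_join function is hypothetical, the sample values are made up, and the handling of duplicate keys (the last lookup row wins) and of unmatched rows (empty values) are assumptions rather than documented Pivot Billions behavior.</p>

```python
def left_join(main_rows, join_rows, key, rename=None):
    """Attach lookup columns to every main row; unmatched rows get None."""
    rename = rename or {}
    # Rename lookup columns first so both sides share the same key label.
    renamed = [
        {rename.get(column, column): value for column, value in row.items()}
        for row in join_rows
    ]
    lookup = {row[key]: row for row in renamed}  # last row wins on duplicate keys
    extra_columns = [c for c in (renamed[0] if renamed else {}) if c != key]
    joined = []
    for row in main_rows:
        match = lookup.get(row[key], {})
        joined.append({**row, **{c: match.get(c) for c in extra_columns}})
    return joined


# Made-up rows; only the column names come from the example above.
main = [
    {"PULocationID": "1", "trip_distance": "2.5"},
    {"PULocationID": "99", "trip_distance": "1.0"},
]
zones = [
    {"LocationID": "1", "Borough": "Borough A", "Zone": "Zone A",
     "service_zone": "Service A"},
]
result = left_join(main, zones, key="PULocationID",
                   rename={"LocationID": "PULocationID"})
```

<p>Every row of the main data set is kept. The row with PULocationID 99 has no match in the lookup rows, so its Borough, Zone and service_zone values are empty in this sketch.</p>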
Analyzing Sample Data
<p>There are two sample data sets provided with the Pivot Billions Docker version. One contains currency data and the other contains New York taxi data. To load either, follow these steps:</p>
<ol>
<li style="list-style-type: none;">
<ol>
<li style="padding-bottom: 10px;">From the Pivot Billions data selection box, select the file you want to load from the selection list.</li>
<li style="padding-bottom: 15px;">For this example select the green_tripdata_2017-01.csv.gz file, and then click on the Preview button.<br />
<img class="alignnone wp-image-589" style="border: 1px solid;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-docker02.png" alt="" width="800" height="442" /></li>
<li style="padding-bottom: 15px;">A preview of the columns will load beneath. You can change the column labels or data type as well as select your data keys here.</li>
<li>By default the Skip Errors option is selected. This instructs Pivot Billions to skip rows with errors in them. Click on the Import button to load the data.</li>
<li style="padding-bottom: 15px;">You should now see the report table with the selected sample data loaded.<br />
<img class="alignnone wp-image-590" style="border: 1px solid;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-docker03.png" alt="" width="800" height="440" /></li>
</ol>
</li>
</ol>
<p>Once the sample data has been loaded, you can begin interacting with and analyzing the data in the Report UI. From here, you can sort, filter, add columns and create pivot tables. The following steps go through a basic exercise to show how to analyze the New York taxi sample data.</p>
<ol>
<li style="padding-bottom: 15px;">Select the <strong>Column View Configuration</strong> icon and click on the <strong>Select None</strong> box.<br />
<img class="alignnone wp-image-549" style="border: 1px solid;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-sample01.png" alt="" width="700" height="176" /></li>
<li style="padding-bottom: 15px;">Click on the <strong>pickup_datetime</strong>, <strong>pickup_location_id</strong>, and <strong>trip_distance</strong> column labels and then click anywhere outside the configuration box.<br />
<img class="alignnone wp-image-551" style="border: 1px solid;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-sample02-1.png" alt="" width="700" height="232" /><br />
Your table should now show only the three columns selected.</li>
<li style="padding-bottom: 15px;">Click on the <strong>Add Column</strong> icon and enter the following:<br />
<strong>Label:</strong> ymd<br />
<strong>Format:</strong> string(s)<br />
<strong>ESS Syntax:</strong> substr(pickup_datetime,0,10)<br />
<img class="alignnone wp-image-552" style="border: 1px solid;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-sample03.png" alt="" width="750" height="409" /><br />
The settings shown above extract the year-month-day from the original pickup_datetime column. For the purpose of our analysis, we don’t want to include the time information from the original column data.</li>
<li style="padding-bottom: 15px;"><strong>Save</strong> the new column. Your table should now show four columns.<br />
<img class="alignnone size-full wp-image-553" style="border: 1px solid;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-sample04.png" alt="" width="642" height="320" /></li>
<li style="padding-bottom: 15px;">Click on the <strong>Pivot</strong> icon and select <strong>pickup_location_id</strong> and <strong>ymd</strong> for your dimensions and select <strong>trip_distance</strong> for your value, then click <strong>View</strong>.<br />
<img class="alignnone wp-image-555" style="border: 1px solid;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-sample05-1.png" alt="" width="750" height="242" /></li>
<li style="padding-bottom: 15px;">A new table will be generated below with aggregated values for the selected dimensions.<br />
<img class="alignnone wp-image-556" style="border: 1px solid;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-sample06.png" alt="" width="750" height="428" /></li>
<li style="padding-bottom: 15px;">Click on the <strong>View Type</strong> icon to switch to <strong>Pivot View</strong>.<br />
<img class="alignnone wp-image-557" style="border: 1px solid;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-sample07.png" alt="" width="750" height="195" /></li>
<li style="padding-bottom: 15px;">Drag the <strong>pickup_location_id</strong> field label to the row area and then drag the <strong>ymd</strong> field to the columns area.<br />
<img class="alignnone wp-image-558" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-sample08.png" alt="" width="750" height="226" /></li>
<li style="padding-bottom: 15px;">Change the data value from <strong>Count</strong> to <strong>Summation</strong>.<br />
<img class="alignnone size-full wp-image-559" style="border: 1px solid;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-sample09.png" alt="" width="483" height="281" /><br />
This changes the data value to the sum of all distances calculated for a pickup_location_id|ymd pair.</li>
<li style="padding-bottom: 15px;">Sort the pivot table by largest total value by clicking on the up down arrow until you see the up arrow.<br />
<img class="alignnone size-full wp-image-560" style="border: 1px solid;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-sample10.png" alt="" width="526" height="295" /><br />
We see that the largest total trip_distance values are associated with pickup_location_id values 74 and 255.</li>
<li style="padding-bottom: 15px;">Click on the <strong>pickup_location_id</strong> field label to see a list of field values, and choose <strong>Select None</strong> to deselect all the values.<br />
<img class="alignnone wp-image-562" style="border: 1px solid;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-sample12.png" alt="" width="400" height="338" /></li>
<li style="padding-bottom: 15px;">Next, select 74 and 255 individually and click Apply. Your pivot table should look like this:<br />
<img class="alignnone wp-image-563" style="border: 1px solid;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-sample13.png" alt="" width="750" height="226" /></li>
<li style="padding-bottom: 15px;">Click on the chart selection list and select the Line Chart option.<br />
<img class="alignnone wp-image-564" style="border: 1px solid;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-sample14.png" alt="" width="400" height="261" /></li>
<li style="padding-bottom: 15px;">You should now see a line chart of total trip distances logged per day for each pickup location ID.<br />
<a href="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-sample15.png"><img class="alignnone wp-image-565" style="border: 1px solid;" src="https://pivotbillions.auriq.co.jp/wp-content/uploads/2019/06/pb-sample15.png" alt="" width="750" height="293" /></a><br />
Notice that 74 has a much more consistent and narrow range for distance, while 255 has a much wider range. The large peaks for 255 are attributed to weekend pickups.</li>
</ol>
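<p>The core of this exercise (deriving a ymd column from pickup_datetime and summing trip_distance for each pickup_location_id and ymd pair) can be sketched in Python. This is an illustration only: it assumes that substr(pickup_datetime,0,10) takes the first 10 characters of the value and that timestamps begin with the date in YYYY-MM-DD form. The function names and sample rows are made up.</p>

```python
from collections import defaultdict


def add_ymd(row):
    """Return a copy of the row with a ymd column holding the date part."""
    # Assumed equivalent of substr(pickup_datetime,0,10): the first 10 characters.
    return {**row, "ymd": row["pickup_datetime"][:10]}


def pivot_sum(rows):
    """Sum trip_distance for every (pickup_location_id, ymd) pair."""
    totals = defaultdict(float)
    for row in map(add_ymd, rows):
        totals[(row["pickup_location_id"], row["ymd"])] += float(row["trip_distance"])
    return dict(totals)


# Made-up sample rows; only the column names come from the exercise above.
trips = [
    {"pickup_datetime": "2017-01-01 00:01:15", "pickup_location_id": "74",
     "trip_distance": "1.5"},
    {"pickup_datetime": "2017-01-01 09:30:00", "pickup_location_id": "74",
     "trip_distance": "2.5"},
    {"pickup_datetime": "2017-01-02 10:00:00", "pickup_location_id": "255",
     "trip_distance": "3.0"},
]
totals = pivot_sum(trips)
```

<p>Each key of the result corresponds to one cell of the pivot table, with pickup_location_id as the row and ymd as the column, and the value is the Summation of trip_distance for that cell.</p>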