Setting Up Local SQL Database

Sebastian Zimmeck edited this page Jun 21, 2024 · 16 revisions

1. Install and Configure XAMPP

Install XAMPP from https://sourceforge.net/projects/xampp/files/XAMPP%20Mac%20OS%20X/8.0.2/. After installation, there will be an XAMPP folder in your Applications. Open the xamppfiles folder and find the manager-osx application. Open it and, under Manage Servers, start MySQL Database and Apache Web Server.

2. Set up SQL Database

Go to http://localhost/phpmyadmin/ to set up the database. Click New and create a database called “analysis”. Then, with the analysis database selected, navigate to the SQL tab.

Paste in the following query and click “Go” at the bottom of the page to create a table named “entries”:

CREATE TABLE entries (
  id INTEGER PRIMARY KEY AUTO_INCREMENT,
  site_id INTEGER,
  domain varchar(255),
  sent_gpc BOOLEAN,
  gpp_version TEXT,
  uspapi_before_gpc varchar(255),
  uspapi_after_gpc varchar(255),
  usp_cookies_before_gpc varchar(255),
  usp_cookies_after_gpc varchar(255),
  OptanonConsent_before_gpc varchar(255),
  OptanonConsent_after_gpc varchar(255),
  gpp_before_gpc TEXT,
  gpp_after_gpc TEXT,
  urlClassification TEXT,
  OneTrustWPCCPAGoogleOptOut_before_gpc BOOLEAN,
  OneTrustWPCCPAGoogleOptOut_after_gpc BOOLEAN,
  OTGPPConsent_before_gpc TEXT,
  OTGPPConsent_after_gpc TEXT
);
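
Once the table exists, rows can be written to it with a parameterized INSERT. The following is a minimal JavaScript sketch, not the project's actual code: the helper name buildInsert and the sample values are hypothetical, and only the column names come from the CREATE TABLE statement for entries. It builds the SQL string and the value list without opening a database connection.

```javascript
// Column names copied from the CREATE TABLE statement for "entries".
// The id column is omitted because AUTO_INCREMENT fills it in.
const ENTRY_COLUMNS = [
  "site_id", "domain", "sent_gpc", "gpp_version",
  "uspapi_before_gpc", "uspapi_after_gpc",
  "usp_cookies_before_gpc", "usp_cookies_after_gpc",
  "OptanonConsent_before_gpc", "OptanonConsent_after_gpc",
  "gpp_before_gpc", "gpp_after_gpc", "urlClassification",
  "OneTrustWPCCPAGoogleOptOut_before_gpc", "OneTrustWPCCPAGoogleOptOut_after_gpc",
  "OTGPPConsent_before_gpc", "OTGPPConsent_after_gpc",
];

// Hypothetical helper: builds a parameterized INSERT for the given row.
// Columns missing from the row are sent as null.
function buildInsert(row) {
  const unknown = Object.keys(row).filter((k) => !ENTRY_COLUMNS.includes(k));
  if (unknown.length > 0) {
    throw new Error("Unknown column(s): " + unknown.join(", "));
  }
  const placeholders = ENTRY_COLUMNS.map(() => "?").join(", ");
  const sql = `INSERT INTO entries (${ENTRY_COLUMNS.join(", ")}) VALUES (${placeholders})`;
  const values = ENTRY_COLUMNS.map((c) => (c in row ? row[c] : null));
  return { sql, values };
}

// Sample values are made up for illustration.
const { sql, values } = buildInsert({ site_id: 1, domain: "example.com", sent_gpc: true });
console.log(sql);
console.log(values);
```

The resulting sql and values can be handed to any MySQL client that supports ? placeholders, which keeps scraped strings from being interpreted as SQL.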

3. Debugging Table (optional)

This optional table can be created with the following query:

CREATE TABLE debug (id INTEGER PRIMARY KEY AUTO_INCREMENT, domain varchar(255), a varchar(4000), b varchar(4000));

This table serves as a substitute for a console log, since we have not found a good way to view console output from the analysis extension during crawls. Its purpose is to help identify bugs in the analysis extension and to help people understand how a site is analyzed. The columns allow logging the domain and two other values of your choice. In general, the debugging statements we have included log the line number in column a and the name of the function containing that line in column b. These statements confirm that the functions run as expected. The function in analysis.js that posts to the debug table is called post_to_debug.
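
To illustrate the logging convention described above, here is a minimal JavaScript sketch of how a debug row could be assembled. It is not the actual post_to_debug implementation in analysis.js; the helper name buildDebugRow and the sample values are hypothetical. The length limits mirror the debug table definition (varchar(255) for domain, varchar(4000) for a and b).

```javascript
// Column limits taken from the CREATE TABLE statement for "debug".
const DEBUG_LIMITS = { domain: 255, a: 4000, b: 4000 };

// Hypothetical helper: converts values to strings and truncates them
// so they fit the varchar columns of the debug table.
function buildDebugRow(domain, a, b) {
  const fit = (value, max) => String(value).slice(0, max);
  return {
    domain: fit(domain, DEBUG_LIMITS.domain),
    a: fit(a, DEBUG_LIMITS.a),
    b: fit(b, DEBUG_LIMITS.b),
  };
}

// Following the convention above: line number in a, function name in b.
// The values here are made up for illustration.
const debugRow = buildDebugRow("example.com", 120, "runAnalysis");
console.log(debugRow);
```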

To use the debugging table after you set it up, you must run the REST API with:

node index.js debug

If you don't want to use the debugging table, run the REST API with the usual command:

node index.js

Whether or not you use the debugging table does not affect the performance of the crawler.
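
For readers curious how a single script can support both commands, the following JavaScript sketch shows one common way to detect a command-line argument in Node.js. This illustrates the general technique only and is an assumption, not a description of how index.js is actually implemented; the function name isDebugMode is hypothetical.

```javascript
// Hypothetical helper: returns true when "debug" was passed on the command line.
// process.argv[0] is the node binary and process.argv[1] is the script path,
// so user-supplied arguments start at index 2.
function isDebugMode(argv) {
  return argv.slice(2).includes("debug");
}

if (isDebugMode(process.argv)) {
  console.log("Debug mode: writes to the debug table would be enabled.");
} else {
  console.log("Normal mode: the debug table is not used.");
}
```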