.. _getting-started:

===============
Getting started
===============

Preparing and publishing a Pattrn instance is quick and easy. In this
section of the Pattrn manual you will learn:

* how to retrieve the Pattrn source code
* how to configure it to use your dataset of events (we will use a sample
  dataset for this tutorial; later sections of this manual explain in detail
  how to check, clean up and configure your own data for Pattrn)
* how to publish your first Pattrn app on the web

Working through this example step by step should take around ten minutes.
After working through this tutorial you will have a fully working Pattrn
instance with the sample dataset and you should be able to repeat the process
with your own datasets. If you require any assistance please get in touch with
the Pattrn team (see the Contacts section of this manual).

Later sections of this manual provide more advanced guidance on:

* how to set up a Pattrn instance using Google Sheets to host the Pattrn
  dataset
* how to set up and use the Pattrn Editor to allow crowdsourcing of event data
* how to fully configure a Pattrn instance with all the advanced features
  available for complex data science projects

Prerequisites
-------------

Before starting this *getting started* tutorial, you need to sign up for a
free account on the `Netlify `_ content publishing platform. We will use the
Netlify service to publish the Pattrn app and data, by dragging and dropping
the Pattrn files from a local folder to the Netlify app in the browser.

It will also be helpful, especially for Windows users, to have a *text
editor* installed on your computer if you don't have one already: if in doubt,
we advise installing the free and open source `Atom editor `_.
Please note that a visual document editor such as LibreOffice Writer or
Microsoft Word *will not, by default, save the files needed to configure
Pattrn in the required plain-text format* (this is possible using specific
settings, but it is much preferable to use a proper *text editor* instead).

Step one: Getting the Pattrn code
---------------------------------

Download the latest Pattrn source code here:
https://github.com/pattrn-project/pattrn/releases

Click on the ``zip`` (for Windows users) or ``tar.gz`` (for macOS or GNU/Linux
users) download link next to the most recent release to download an archive
with the Pattrn app code to your computer.

Extract the contents of the archive and navigate to the extracted folder (its
name is ``pattrn``). Within the Pattrn source code, we will be working inside
the subfolder called ``dist``: this contains just the

Step two: Link the Pattrn app to a dataset
------------------------------------------

A brief overview of the sample dataset
......................................

For this tutorial we will use a sample dataset from an earlier research
project carried out at Forensic Architecture, the research group that is home
to the Pattrn project.

The *Where The Drones Strike* project analysed and mapped drone strikes in
Pakistan, using data collected by the `Bureau of Investigative Journalism `_.
The analysis was visualised interactively at http://wherethedronesstrike.com;
for this tutorial we will take the dataset used for the investigation and
visualise it on a Pattrn instance.

The source dataset provided by the Bureau of Investigative Journalism is
`available on the BIJ website `_ as a Google Sheets database. For this
tutorial, the Pattrn team has exported a subset of the full dataset as a
`GeoJSON file `_, suitable to be packaged directly in a Pattrn instance.
Whereas the full dataset contains a few dozen variables, here we will be using
only a handful, to show how Pattrn can visualise different types of variables.

Pattrn can handle four types of variables:

* **integer**: these are basically *counts* of things; for example number of
  casualties, number of buildings hit in an attack, etc.
* **tag**: these are labels that describe traits of an event; for example the
  target type of an attack ('civilian', 'infrastructure', 'hospital', etc.),
  the perpetrator group, etc. For this variable type zero, one or more labels
  can be used, separated by a comma, e.g. "civilian, infrastructure";
  "infrastructure"; "hospital, infrastructure".
* **boolean**: these are variables that represent a question that can be
  answered with either a *yes* or *no*; for example, whether an attack
  resulted in structural damage to buildings (the answer could be yes or no,
  or perhaps unknown or unreported, but we will not deal with the latter cases
  for the moment).
* **tree**: these are variables that describe traits of an event (like
  variables of the tag type above), but where the traits can be logically
  organised in a tree-like structure, where parent tag nodes group together
  children tag nodes that are sub-traits of the parent tag. This is an
  advanced variable type that will not be covered in this tutorial (more
  detailed information is provided in later sections of the Pattrn manual).

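
To make the three simpler variable types more concrete, here is a minimal
Python sketch of how a single event could be represented and how a
comma-separated tag value could be split into individual labels. The
``parse_tags`` helper and the ``event`` record (including its variable names
and values) are hypothetical and only serve as illustration; this is not how
Pattrn itself is implemented.

```python
def parse_tags(raw):
    """Split a comma-separated tag value into a list of clean labels."""
    if raw is None:
        return []
    # Drop surrounding whitespace and ignore empty pieces, so that
    # "a, b" and "a,b" give the same result and "" gives no labels.
    return [label.strip() for label in raw.split(",") if label.strip()]


# Hypothetical event record with one variable of each simple type.
event = {
    "buildings_hit": 3,                          # integer: a count
    "target_type": "hospital, infrastructure",   # tag: zero or more labels
    "structural_damage": True,                   # boolean: yes or no
}

print(parse_tags(event["target_type"]))  # prints ['hospital', 'infrastructure']
```

An empty tag value yields an empty list, which corresponds to the "zero
labels" case described above.
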

Within these four types of variables, the sample dataset carries variables of
three types:

* **integer**:

  * ``casualties_min``: this is the count of casualties (minimum reported)
    caused by a drone strike
  * ``children_casualties_min``: this is the count of children casualties
    (minimum reported)

* **tag**:

  * ``time_period``: whether the attack was reported as having been carried
    out in the morning, afternoon, evening or night (the source dataset only
    includes the day an attack happened and this rough indication of the time
    of the day rather than an exact time)
  * ``tribal_agency``: tribal agency target of the drone strike

* **boolean**:

  * ``structural_damage``: whether structural damage was observed as a
    consequence of a drone strike

Download the dataset
....................

We will start by downloading the GeoJSON file here:
https://gitlab.com/pattrn-data/pattrn-data-where-the-drones-strike/raw/pattrn-data/pattrn-data/data/pattrn-data-where-the-drones-strike/data.geojson

First we need to create a folder for data and metadata within the Pattrn
source folder: making sure you are in the ``dist`` folder of the Pattrn folder
extracted in the previous step, create a new folder and give it the name
``data``.

Now copy the file just downloaded (``data.geojson``) into the ``data`` folder.

We now need to modify a few files included in the Pattrn distribution in order
to configure our Pattrn instance to correctly use this dataset, before
publishing the full app to the web.

.. _getting-started-metadata:

Pattrn metadata: configuring variables
......................................

Besides a few core variables that need to be present in every dataset used
with Pattrn (``latitude`` and ``longitude`` for locating events, ``date_time``
to track when they happened), most of the other variables will typically have
names and meanings specific to each dataset. In order to let our Pattrn
instance visualise each variable correctly (using a chart type appropriate to
the variable type, such as line charts for integer variables, bar charts for
tag variables, etc.) and so that descriptive variable names are displayed to
visitors, we need to prepare a simple *metadata* file that will instruct
Pattrn on how to use each variable.

The easiest way to do this is to create a *plain text* file using the Atom
editor installed when preparing for this tutorial: create a new file with name
``metadata.yaml`` inside the ``data`` folder where you saved the dataset
earlier, and add the following content to the file::

    ---
    document_schema:
      id: "pattrn_metadata"
      version: 2
    variables:
      integer:
        - id: "casualties_min"
          name: "people reported killed (minimum)"
        - id: "children_casualties_min"
          name: "children reported killed (minimum)"
        - id: "injured_min"
          name: "injured (minimum)"
      tag:
        - id: "time_period"
          name: "time period"
        - id: "tribal_agency"
          name: "tribal agency"
      boolean:
        - id: "structural_damage"
          name: "structural damage observed"

The initial ``document_schema`` section simply instructs Pattrn to interpret
this as a Pattrn metadata file with version '2' (to differentiate this from
earlier metadata file formats used in previous versions of Pattrn).

The dataset-specific variables are configured in the main ``variables``
section starting on line 5, in sub-sections broken down by variable type.

Here we configure three variables of type *integer* (the two listed earlier,
plus ``injured_min``), and associate with each of them a brief, descriptive
label that is then used in the visualised data. Likewise, we configure two
variables of type *tag* and one variable of type *boolean*.

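
Since every ``id`` in the metadata refers to a variable in the dataset, it can
be useful to check that the two agree before publishing. The following Python
sketch shows one possible way to do so. It is an illustration under stated
assumptions, not part of Pattrn: the ``declared_ids`` and ``missing_ids``
helpers are hypothetical, the metadata is written as a Python dictionary
mirroring the YAML above (to avoid depending on a YAML library), and the
``sample`` feature uses made-up placeholder values rather than records from
the real dataset.

```python
def declared_ids(metadata):
    """Collect every variable id declared in the metadata, across all types."""
    ids = []
    for variables in metadata.get("variables", {}).values():
        ids.extend(variable["id"] for variable in variables)
    return ids


def missing_ids(metadata, geojson):
    """Return declared ids that no feature in the GeoJSON data carries."""
    seen = set()
    for feature in geojson.get("features", []):
        seen.update((feature.get("properties") or {}).keys())
    return [var_id for var_id in declared_ids(metadata) if var_id not in seen]


# The metadata from the example above, as a Python dictionary.
metadata = {
    "variables": {
        "integer": [
            {"id": "casualties_min"},
            {"id": "children_casualties_min"},
            {"id": "injured_min"},
        ],
        "tag": [{"id": "time_period"}, {"id": "tribal_agency"}],
        "boolean": [{"id": "structural_damage"}],
    }
}

# Hypothetical one-feature dataset with placeholder values; it deliberately
# lacks "injured_min" to show what a mismatch looks like.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [0.0, 0.0]},
            "properties": {
                "casualties_min": 0,
                "children_casualties_min": 0,
                "time_period": "night",
                "tribal_agency": "placeholder",
                "structural_damage": False,
            },
        }
    ],
}

print(missing_ids(metadata, sample))  # prints ['injured_min']
```

An empty list means that every variable declared in the metadata was found in
at least one event of the dataset.
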
When configuring your own dataset's variables later on, you may want to use
this example ``metadata.yaml`` file as a starting point, and edit it to match
the variables in your dataset, paying attention to the following:

* the file format requires the various sections to be "nested" as seen in the
  example above, using increasing indentation (two more spaces for each
  nesting level)
* each variable definition needs to be "nested" within the appropriate
  variable type
* if no variables of a given type are present in your dataset, you can remove
  the corresponding section altogether (in the example above, there is no
  ``tree`` section as we don't have any variable of type *tree*)
* for each variable, you **must** provide an ``id``, which **must match
  exactly** the name of the variable as defined in your dataset
* for each variable, you **may** provide a ``name``: a brief (50 characters or
  less) descriptive label to be used instead of the variable's ``id`` in the
  Pattrn visualisations

For convenience, you may want to download a ready-made ``metadata.yaml`` file
for this dataset here:
https://gitlab.com/pattrn-data/pattrn-data-where-the-drones-strike/raw/pattrn-data/pattrn-data/data/pattrn-data-where-the-drones-strike/metadata.yaml

However, we recommend working through the example above step by step in order
to understand how to adapt the example to your own dataset. If you are having
trouble preparing a ``metadata.yaml`` file for your own datasets, please get
in touch with the Pattrn team (see the Contacts section of this manual).

Configure basic settings for the Pattrn instance
................................................

Pattrn provides a few configurable settings that can be used to fine-tune the
appearance of each Pattrn instance.
A title and subtitle/tagline can be supplied, and some details of the map can
be configured too, such as marker colour, marker opacity, minimum and maximum
allowed zoom levels, and whether and how markers should be clustered together
at certain zoom levels in order to improve the map's legibility.

Pattrn will use sensible defaults for most settings; however editors will most
likely want to configure a title and a subtitle. In order to do so, create a
new *plain text* file within the same ``data`` folder where you already placed
the data file and metadata file in the previous steps. Name this file
``settings.json`` and set its content as in the example below::

    {
      "title" : "Where the drones strike",
      "subtitle" : "Spatial analysis of drone strikes in the frontier regions of Pakistan",
      "map": {
        "root_selector": "chart-map",
        "markers": {
          "color": "black",
          "fillColor": "black",
          "opacity": "0.8"
        },
        "zoom": {
          "max": 15,
          "min": 6
        },
        "disableClusteringAtZoom": 17
      }
    }

For convenience, you may want to download a ready-made ``settings.json`` file
for this sample instance here:
https://gitlab.com/pattrn-data/pattrn-data-where-the-drones-strike/raw/pattrn-data/pattrn-data/data/pattrn-data-where-the-drones-strike/settings.json

For your own future Pattrn instances, you will likely want to customise the
``title`` and ``subtitle`` settings. If you omit the remaining settings,
Pattrn will use its defaults.

Linking all the configuration files to the Pattrn instance
..........................................................

You now have data, metadata and settings files in place, all within the
``data`` subfolder of the ``dist`` folder within the Pattrn source code.

A final step is needed, in order to let the code for this Pattrn instance know
*where* to find the data, metadata and settings files.
For this tutorial we are packaging all these files together with the Pattrn
source code that will be published to a website at the next step; however,
Pattrn can also use data (and metadata and settings) files hosted elsewhere:
this is what makes it possible to plug into Pattrn dynamic datasets that keep
growing as new data is supplied or edited.

By default, Pattrn will attempt to retrieve its core files from default
locations; however it is always preferable to explicitly configure the
location of these core files.

The Pattrn source code comes with a ``config.json`` pre-configured with these
default locations: here we will just review these settings by opening the
``config.json`` file, which can be found directly within the ``dist`` folder
(not within the ``dist/data`` folder where we have been editing the previous
files)::

    {
      "data_sources": {
        "geojson_data" : {
          "data_url" : "/data/data.geojson",
          "settings_url" : "/data/settings.json",
          "metadata_url" : "/data/metadata.json"
        }
      }
    }

The default configuration file above is:

* configuring a single data source ...
* ... that is using the GeoJSON format ...
* ... and whose data, metadata and settings files are named respectively
  ``data.geojson``, ``metadata.yaml`` and ``settings.json`` and are packaged
  directly with the Pattrn app, within the ``data`` folder, as we have done
  above

Note that the metadata file created in this tutorial is named
``metadata.yaml``: make sure that ``metadata_url`` points to
``/data/metadata.yaml``, and edit the value if your copy of ``config.json``
lists a different file name.

Congratulations! Your first Pattrn instance is now fully configured and we can
proceed to publishing it to the web.

Step three: Publishing the Pattrn app
-------------------------------------

Log in to your Netlify account at https://netlify.com/.

Navigate to the list of your sites on Netlify: https://app.netlify.com/sites.

Here you can create and publish a new website simply by dragging a folder from
your computer to the area at the bottom of this page (the area surrounded by a
dashed line, with the message "Need to share a quick prototype or publish a
simple mockup? Drag a folder with a static site here.").

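
Before dragging the folder to Netlify, it can be worth checking that every
file referenced in ``config.json`` actually exists in the ``dist`` folder. The
Python sketch below shows one way to do this. It is only an illustration: the
``missing_files`` helper is hypothetical, and it assumes that a URL starting
with ``/`` refers to a file relative to the ``dist`` folder, which becomes the
root of the published site. URLs pointing to files hosted elsewhere are
skipped.

```python
import json
from pathlib import Path


def missing_files(dist_dir):
    """Return the local URLs listed in config.json that have no file in dist."""
    dist = Path(dist_dir)
    config = json.loads((dist / "config.json").read_text(encoding="utf-8"))
    missing = []
    for source in config.get("data_sources", {}).values():
        for key, url in source.items():
            # Only look at the *_url entries of each data source.
            if not key.endswith("_url") or not isinstance(url, str):
                continue
            # Files hosted elsewhere cannot be checked on the local disk.
            if url.startswith(("http://", "https://")):
                continue
            # Assumption: "/data/x" maps to "<dist>/data/x".
            if not (dist / url.lstrip("/")).is_file():
                missing.append(url)
    return missing
```

Calling ``missing_files("dist")`` from within the ``pattrn`` folder returns an
empty list when all the configured files are in place; any URL it returns
points either to a file that still needs to be created or to a file name in
``config.json`` that needs correcting.
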
In order to publish the Pattrn app you have just finished configuring,
navigate to the Pattrn folder and drag **its** ``dist`` **folder** to the
target area on the Netlify sites page.

As the files are being uploaded and processed by the Netlify web app, visual
feedback on the progress will be displayed on the web page, until the Pattrn
app is fully published. Its web address will be displayed on the site's
configuration page within the Netlify dashboard. By default a random address
such as https://secure-synapse-12345.netlify.com will be allocated by Netlify,
and this can be customised later through the Netlify dashboard.

Creating a new Pattrn instance with your own data
-------------------------------------------------

Congratulations! Your first Pattrn instance is now published and its data can
be explored online.

If you wish to create a new instance *using your own dataset* now, you may
wish to go through this tutorial again, but using your own GeoJSON data file,
metadata and settings instead.

If you wish to make sure your dataset is ready for use with Pattrn, we
recommend that you read the section of this manual about *Preparing your data
for Pattrn* first.