.. _getting-started:

===============
Getting started
===============

Preparing and publishing a Pattrn instance is quick and easy. In this
section of the Pattrn manual you will learn:

* how to retrieve the Pattrn source code
* how to configure it to use your dataset of events (we will use a
  sample dataset for this tutorial; later sections of this manual
  explain in detail how to check, clean up and configure your own
  data for Pattrn)
* how to publish your first Pattrn app on the web

Working through this example step by step should take around ten
minutes. After working through this tutorial you will have a fully
working Pattrn instance with the sample dataset and you should be
able to repeat the process with your own datasets.

If you require any assistance please get in touch with the Pattrn
team (see the Contacts section of this manual).

Later sections of this manual provide more advanced guidance on:

* how to set up a Pattrn instance using Google Sheets to host the
  Pattrn dataset
* how to set up and use the Pattrn Editor to allow crowdsourcing of
  event data
* how to fully configure a Pattrn instance with all the advanced
  features available for complex data science projects

Prerequisites
-------------

Before starting this *getting started* tutorial, you need to sign up
for a free account on the `Netlify <https://netlify.com>`_ content
publishing platform. We will use the Netlify service to publish
the Pattrn app and data, by dragging-and-dropping the Pattrn files
from a local folder to the Netlify app in the browser.

It will also be helpful, especially for Windows users, to have a
*text editor* installed on your computer if you don't have one
already: if in doubt, we advise installing the free and open
source `Atom editor <https://atom.io/>`_. Please note that a
visual document editor such as LibreOffice Writer or Microsoft
Word *will not, by default, save the files needed to configure
Pattrn in the required plain-text format* (you
could actually do this using specific settings, but it's much
preferable to use a proper *text editor* instead).

Step one: Getting the Pattrn code
---------------------------------

Download the latest Pattrn source code here:

https://github.com/pattrn-project/pattrn/releases

Click on the ``zip`` (for Windows users) or ``tar.gz`` (for macOS
or GNU/Linux users) download link next to the most recent release
to download an archive with the Pattrn app code to your computer.

Extract the contents of the archive and navigate to the extracted
folder (its name is ``pattrn``).
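
If you prefer to extract the archive from a script rather than with your
file manager, the step above can be sketched in Python as follows. This is
only an illustration: the function name ``extract_release`` and the archive
file name used in the usage note are placeholders, not names defined by the
Pattrn project.

.. code-block:: python

    import shutil
    from pathlib import Path


    def extract_release(archive, target_dir):
        """Extract a downloaded release archive (zip or tar.gz) into target_dir."""
        target = Path(target_dir)
        target.mkdir(parents=True, exist_ok=True)
        # shutil chooses the unpacker (zip, tar.gz, ...) from the file extension
        shutil.unpack_archive(str(archive), str(target))
        # Return the names of the extracted top-level entries
        return sorted(p.name for p in target.iterdir())

For example, ``extract_release("pattrn.zip", "pattrn-release")`` would unpack a
(hypothetically named) downloaded archive into a new ``pattrn-release`` folder.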

Within the Pattrn source code, we will be working inside of the
subfolder called ``dist``: this contains just the 

Step two: Link the Pattrn app to a dataset
------------------------------------------

A brief overview of the sample dataset
......................................

For this tutorial we will use a sample dataset from an earlier
research project carried out at Forensic Architecture, the
research group that is home to the Pattrn project.

The *Where The Drones Strike* project analysed and mapped drone
strikes in Pakistan, using data collected by the `Bureau of
Investigative Journalism <https://www.thebureauinvestigates.com/>`_.

The analysis was visualised interactively at http://wherethedronesstrike.com;
for this tutorial we will use the dataset used for the investigation
and visualise it on a Pattrn instance.

The source dataset provided by the Bureau of Investigative Journalism
is `available on the BIJ website
<https://www.thebureauinvestigates.com/stories/2017-01-01/drone-wars-the-full-data>`_
as a Google Sheets database. For this tutorial, the Pattrn team has
exported a subset of the full dataset as a
`GeoJSON file <https://geojson.org/>`_, suitable to be packaged directly
in a Pattrn instance.

Whereas the full dataset contains a few dozen variables, here we will
be using only a handful, to show how Pattrn can visualise different types
of variables.

Pattrn can handle four types of variables:

* **integer**: these are basically *counts* of things; for example number
  of casualties, number of buildings hit in an attack, etc.
* **tag**: these are labels that describe traits of an event; for example
  the target type of an attack ('civilian', 'infrastructure',
  'hospital', etc.), the perpetrator group, etc. For this variable type
  zero, one or more labels can be used, separated by a comma, e.g.
  "civilian, infrastructure"; "infrastructure"; "hospital, infrastructure".
* **boolean**: these are variables that represent a question that can
  be answered with either a *yes* or *no*; for example, whether an
  attack resulted in structural damage to buildings (the answer could
  be yes or no, or perhaps unknown or unreported, but we will not deal
  with the latter cases for the moment).
* **tree**: these are variables that describe traits of an event (like
  variables of the *tag* type above), but where the traits can be
  logically organised in a tree-like structure, where parent tag nodes
  group together children tag nodes that are sub-traits of the parent
  tag. This is an advanced variable type that will not be covered in
  this tutorial (more detailed information is provided in later sections
  of the Pattrn manual).
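
To make the *tag* type more concrete, the sketch below splits a
comma-separated value into individual labels, as described above. The
function name ``split_tags`` is hypothetical and this is not Pattrn's own
parsing code; it only illustrates how zero, one or more labels can be read
from a single value.

.. code-block:: python

    def split_tags(value):
        """Split a comma-separated tag value into a list of trimmed labels."""
        if value is None:
            return []
        # Drop surrounding whitespace and ignore empty fragments
        return [label.strip() for label in value.split(",") if label.strip()]


    print(split_tags("civilian, infrastructure"))  # ['civilian', 'infrastructure']
    print(split_tags("infrastructure"))            # ['infrastructure']
    print(split_tags(""))                          # []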

Within these four types of variables, the sample dataset carries
variables of three types:

* **integer**:

  * ``casualties_min``: this is the count of casualties (minimum reported)
    caused by a drone strike
  * ``children_casualties_min``: this is the count of children casualties
    (minimum reported)

* **tag**:

  * ``time_period``: whether the attack was reported as having been
    carried out in the morning, afternoon, evening or night (the source
    dataset only includes the day an attack happened and this rough
    indication of the time of the day rather than an exact time)
  * ``tribal_agency``: tribal agency target of the drone strike

* **boolean**:

  * ``structural_damage``: whether structural damage was observed as
    consequence of a drone strike

Download the dataset
....................

We will start by downloading the GeoJSON file here:
https://gitlab.com/pattrn-data/pattrn-data-where-the-drones-strike/raw/pattrn-data/pattrn-data/data/pattrn-data-where-the-drones-strike/data.geojson

First, we need to create a folder for data and metadata within the Pattrn
source folder. Making sure you are in the ``dist`` folder of the Pattrn
folder extracted in the previous step, create a new folder and give it
the name ``data``.

Now copy the file just downloaded (``data.geojson``) inside of the
``data`` folder.
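
The two file operations above (creating the ``data`` folder and copying the
downloaded file into it) can also be scripted. The following is a minimal
sketch; the function name ``place_dataset`` is a placeholder, and the paths
you pass in depend on where you extracted Pattrn and where your browser
saved the download.

.. code-block:: python

    import shutil
    from pathlib import Path


    def place_dataset(dist_dir, downloaded_file):
        """Create dist/data if needed and copy the GeoJSON file into it."""
        data_dir = Path(dist_dir) / "data"
        data_dir.mkdir(parents=True, exist_ok=True)
        destination = data_dir / "data.geojson"
        # Copy rather than move, so the original download is left untouched
        shutil.copyfile(downloaded_file, destination)
        return destination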

We now need to modify a few files included in the Pattrn distribution
in order to configure our Pattrn instance to correctly use this dataset,
before publishing the full app to the web.

.. _getting-started-metadata:

Pattrn metadata: configuring variables
......................................

Besides a few core variables that need to be present in every dataset used
with Pattrn (``latitude`` and ``longitude`` for locating events, ``date_time``
to track when they happened), most of the other variables will typically have
names and meanings specific to each dataset: in order to let our Pattrn
instance visualise each variable correctly (using a chart type appropriate
to the variable type, such as line charts for integer variables, bar charts
for tag variables, etc.) and so that descriptive variable names are
displayed to visitors, we need to prepare a simple *metadata* file that
will instruct Pattrn on how to use each variable.

The easiest way to do this is to create a *plain text* file using the
Atom editor installed when preparing for this tutorial: create a new
file with name ``metadata.yaml`` inside of the ``data`` folder where
you saved the dataset earlier, and add the following content to the file::

    ---
    document_schema: 
        id: "pattrn_metadata"
        version: 2
    variables: 
        integer: 
        - 
            id: "casualties_min"
            name: "people reported killed (minimum)"
        - 
            id: "children_casualties_min"
            name: "children reported killed (minimum)"
        - 
            id: "injured_min"
            name: "injured (minimum)"
        tag: 
        - 
            id: "time_period"
            name: "time period"
        - 
            id: "tribal_agency"
            name: "tribal agency"
        boolean: 
        - 
            id: "structural_damage"
            name: "structural damage observed"

The initial ``document_schema`` section simply instructs Pattrn to interpret this
as a Pattrn metadata file with version '2' (to differentiate this from earlier
metadata file formats used in previous versions of Pattrn).

The dataset-specific variables are configured in the main ``variables`` section
starting on line 5, in sub-sections broken down by variable type. Here we
configure three variables of type *integer* (the two listed earlier plus
``injured_min``) and associate with each of them a brief, descriptive label that
is then used in the visualised data. Likewise, we configure two variables of
type *tag* and one variable of type *boolean*.

When configuring your own dataset's variables later on, you may want to use this
example ``metadata.yaml`` file as a starting point, and edit it to match the
variables in your dataset, paying attention to the following:

* the file format requires the various sections to be "nested" as seen in the
  example above, using increasing indentation (the example uses four spaces
  more for each nesting level; use spaces only, never tabs, and keep the
  indentation consistent)
* each variable definition needs to be "nested" within the appropriate variable
  type
* if no variables of a given type are present in your dataset, you can remove
  the corresponding section altogether (in the example above, there is no
  ``tree`` section as we don't have any variable of type *tree*)
* for each variable, you **must** provide an ``id``, which **must match
  exactly** the name of the variable as defined in your dataset
* for each variable, you **may** provide a ``name``: a brief (50 characters or
  fewer) descriptive label to be used instead of the variable's ``id`` in the
  Pattrn visualisations
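
Because each ``id`` must match a variable name in the dataset exactly, it can
be useful to check this before publishing. The sketch below is one way to do
so, under the assumption that the variables are stored in the ``properties``
object of each GeoJSON feature; the function name ``missing_variable_ids`` is
hypothetical and is not part of Pattrn.

.. code-block:: python

    import json


    def missing_variable_ids(geojson_path, variable_ids):
        """Return the ids that do not appear as a property of any feature."""
        with open(geojson_path, encoding="utf-8") as handle:
            collection = json.load(handle)
        found = set()
        for feature in collection.get("features", []):
            # GeoJSON keeps per-feature attributes under "properties"
            found.update((feature.get("properties") or {}).keys())
        return [vid for vid in variable_ids if vid not in found]

Calling it with the path to ``data.geojson`` and the list of ``id`` values from
your metadata file returns an empty list when every ``id`` is found; any
returned name points to a likely typo or a missing column.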

For convenience, you may want to download a ready-made ``metadata.yaml``
file for this dataset here:
https://gitlab.com/pattrn-data/pattrn-data-where-the-drones-strike/raw/pattrn-data/pattrn-data/data/pattrn-data-where-the-drones-strike/metadata.yaml

However, we recommend working through the example above step by step in order
to understand how to adapt the example to your own dataset.

If you are having trouble preparing a ``metadata.yaml`` file for your own
datasets, please get in touch with the Pattrn team (see the Contacts section
of this manual).

Configure basic settings for the Pattrn instance
................................................

Pattrn provides a few configurable settings that can be used to fine-tune
the appearance of each Pattrn instance. A title and subtitle/tagline can
be supplied, and some details of the map can be configured too, such as
marker colour, marker opacity, minimum and maximum allowed zoom levels,
and whether and how markers should be clustered together at certain zoom
levels in order to improve the map's legibility.

Pattrn will use sensible defaults for most settings; however editors will
most likely want to configure a title and a subtitle. In order to do so,
create a new *plain text* file within the same ``data`` folder where
you already placed the data file and metadata file in the previous steps.
Name this file ``settings.json`` and set its content as in the
example below::

    {
        "title" : "Where the drones strike",
        "subtitle" : "Spatial analysis of drone strikes in the frontier regions of Pakistan",
        "map": {
            "root_selector": "chart-map",
            "markers": {
                "color": "black",
                "fillColor": "black",
                "opacity": "0.8"
            },
            "zoom": {
                "max": 15,
                "min": 6
            },
            "disableClusteringAtZoom": 17
        }
    }

For convenience, you may want to download a ready-made ``settings.json`` file
for this sample instance here:
https://gitlab.com/pattrn-data/pattrn-data-where-the-drones-strike/raw/pattrn-data/pattrn-data/data/pattrn-data-where-the-drones-strike/settings.json

For your own future Pattrn instances, you will likely want to customise the
``title`` and ``subtitle`` settings. If you omit the remaining settings,
Pattrn will use its defaults.
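
A settings file that contains only a title and a subtitle can also be
generated from a script, which guarantees that the result is valid JSON. The
following is a minimal sketch; the function name ``write_minimal_settings`` is
a placeholder and not part of Pattrn.

.. code-block:: python

    import json


    def write_minimal_settings(path, title, subtitle):
        """Write a settings file with only the title and subtitle keys."""
        settings = {"title": title, "subtitle": subtitle}
        with open(path, "w", encoding="utf-8") as handle:
            # indent=4 keeps the file readable in a text editor
            json.dump(settings, handle, indent=4, ensure_ascii=False)
            handle.write("\n")
        return settings

For example, ``write_minimal_settings("dist/data/settings.json", "My title",
"My subtitle")`` would create the file in the ``data`` folder used in this
tutorial, assuming the script is run from the Pattrn folder.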

Linking all the configuration files to the Pattrn instance
..........................................................

You now have data, metadata and settings files in place, all within the
``data`` subfolder of the ``dist`` folder within the Pattrn source code.

A final step is needed, in order to let the code for this Pattrn instance
know *where* to find the data, metadata and settings files.

For this tutorial we are packaging all these files together with the
Pattrn source code that will be published to a website at the next
step; however, Pattrn can also use data (and metadata and settings)
files hosted elsewhere: this is what makes it possible to plug dynamic
datasets, which keep growing as new data is supplied or edited, into
Pattrn. By default, Pattrn will attempt to retrieve
its core files from default locations; however, it is always preferable
to explicitly configure the location of these core files.

The Pattrn source code comes with a ``config.json`` pre-configured
with these default locations: here we will just review these
settings by opening the ``config.json`` file, which can be found
directly within the ``dist`` folder (not within the ``dist/data``
folder where we have been editing the previous files)::

    {
        "data_sources": {
            "geojson_data" : {
                "data_url" : "/data/data.geojson",
                "settings_url" : "/data/settings.json",
                "metadata_url" : "/data/metadata.yaml"
            }
        }
    }

The default configuration file above is:

* configuring a single data source ...
* ... that is using the GeoJSON format ...
* ... and whose data, metadata and settings files are named
  respectively ``data.geojson``, ``metadata.yaml`` and
  ``settings.json`` and are packaged directly with the Pattrn app,
  within the ``data`` folder, as we have done above
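
Before publishing, you may want to confirm that every location listed in
``config.json`` points to a file that actually exists in your ``dist``
folder. The sketch below does this under the assumption that a path starting
with ``/`` is resolved relative to the published ``dist`` folder; the function
name ``check_config_paths`` is hypothetical and not part of Pattrn.

.. code-block:: python

    import json
    from pathlib import Path


    def check_config_paths(dist_dir):
        """Report, for each site-relative URL in config.json, whether the file exists."""
        dist = Path(dist_dir)
        with open(dist / "config.json", encoding="utf-8") as handle:
            config = json.load(handle)
        report = {}
        for source in config.get("data_sources", {}).values():
            for url in source.values():
                # Only site-relative paths can be checked on the local disk
                if isinstance(url, str) and url.startswith("/"):
                    report[url] = (dist / url.lstrip("/")).is_file()
        return report

Any entry reported as ``False`` indicates a file that is missing or named
differently from what the configuration expects.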

Congratulations! Your first Pattrn instance is now fully configured and
we can now proceed to publishing it to the web.

Step three: Publishing the Pattrn app
-------------------------------------

Log in to your Netlify account at https://netlify.com/.

Navigate to the list of your sites on Netlify: https://app.netlify.com/sites.

Here you can create and publish a new website simply by dragging a folder
from your computer to the area at the bottom of this page (the area surrounded
by a dashed line, with the message "Need to share a quick prototype or publish
a simple mockup? Drag a folder with a static site here.").

In order to publish the Pattrn app you have just finished configuring,
navigate to the Pattrn folder and drag **its** ``dist`` **folder** to the
target area on the Netlify sites page.

As the files are being uploaded and processed by the Netlify web app, visual
feedback on the progress will be displayed on the web page, until the Pattrn
app is fully published. Its web address will be displayed on the site's
configuration page within the Netlify dashboard. By default a
random address such as https://secure-synapse-12345.netlify.com will be
allocated by Netlify, and this can be customised later through the Netlify
dashboard.

Creating a new Pattrn instance with your own data
-------------------------------------------------

Congratulations! Your first Pattrn instance is now published and its data
can be explored online. If you wish to create a new instance *using your
own dataset* now, you may wish to go through this tutorial again, but
using your own GeoJSON data file, metadata and settings instead. If you
wish to make sure your dataset is ready for use with Pattrn, we recommend
that you read the section of this manual about *Preparing your data for
Pattrn* first.