Upload CSV File in Python

In this tutorial you will learn how to parse a CSV file (Comma Separated Values), convert your Assets to Woosmap required structure and import them using Woosmap Data API.

Prerequisites

To run this tutorial, you’ll need:

  • Python 2.7 or greater.
  • The pip package management tool.
  • Access to the internet.
  • A Woosmap account.

CSV Formatting

Comma Separated Values (or TSV, Tab-Separated Values) files can be imported to Woosmap. For a successful import, follow these formatting guidelines:

  • The first line of the CSV file must contain the name of the columns
  • The rest of the lines of the CSV file must follow the schema defined by the header column, in terms of number of columns

Example: sample CSV file comma separated with the name of the columns at first line

Latitude,Longitude,Name,Address Line
51.919948,4.486843,Markthal Rotterdam,Dominee Jan Scharpstraat 298

CSV Python Module

Because Woosmap API only accepts JSON as input, the next step is to transform each row of your file to Woosmap elements. The following python script is based on the native csv module. It makes it easier to deal with csv formatted file, especially when working with data exported from spreadsheets and databases into text files. We preferred to iterate over the data rows using csv.DictReader instead of classic reader mainly because it takes advantage of the header row with the column names and gives you the opportunity to grab it cell value by its name.

Dialect

There is no well-defined standard for comma-separated value files, there are plenty of ways one CSV file can differ from another, yet contains exactly the same data. Many tools, which can import or export tabular data allow the user to indicate the field delimiter, quote character, line terminator, and other characteristics of the file. All these parameters are grouped together conveniently into a Dialect object.

When creating a csv.DictReader object, the programmer can specify a subclass of the Dialect class as the dialect parameter. In our CSV file the fields are separated by commas and some values are double quoted.

import csv

class MyCSVDialect(csv.Dialect):
    delimiter = ','
    quotechar = '"'
    doublequote = True
    skipinitialspace = False
    lineterminator = '\n'
    quoting = csv.QUOTE_ALL

with open(file_path, 'r') as csv_file:
    reader = csv.DictReader(csv_file, dialect=MyCSVDialect())

Mappping Data Structure

The one step further is to transform each row element to an Asset object that follows the required structured format. The mandatory fields are a storeId (as String), a name and a location geometry object (a couple of Latitude and Longitude). Below is the method used to transform each row of our CSV to a Woosmap Asset element:

from hashlib import sha1

def get_name(asset):
    return asset.get('Name', '')

def generate_id(asset):
    asset_id = sha1(get_name(asset)).hexdigest()
    return asset_id

def get_contact(asset):
    return {
        'website': asset.get('Website', ''),
        'phone': asset.get('Contact Phone', ''),
        'email': asset.get('Contact Email', '')
    }

def get_geometry(asset):
    return {
        'lat': float(asset.get('Latitude', None)),
        'lng': float(asset.get('Longitude', None))
    }

def get_address(asset):
    return {
        'lines': [asset.get('Address Line', '')],
        'city': asset.get('City', ''),
        'country': asset.get('Country', ''),
        'zipcode': asset.get('Zipcode', '')
    }

def convert_to_woosmap(asset):
    return {
        'storeId': generate_id(asset),
        'name': get_name(asset),
        'address': get_address(asset),
        'contact': get_contact(asset),
        'location': get_geometry(asset)
    }

We are now able to open the file and iterate over the data to build an array of all the Woosmap Asset elements before POST them to Woosmap API:

with open(file_path, 'r') as csv_file:
    reader = csv.DictReader(csv_file, dialect=MyCSVDialect())
    woosmap_assets = []
    for asset in reader:
        converted_asset = convert_to_woosmap(asset)
        woosmap_assets.append(converted_asset)

Work With Woosmap API

The Woosmap Data API is a RESTful API with endpoint https://api.woosmap.com/stores. It takes a mandatory parameter : your private key. Therefore the url you call should look like this: htps://api.woosmap.com/stores?private_key=YOUR_PRIVATE_KEY.

We use Requests to call the Woosmap API. Simply run this command in your terminal of choice to install it using pip:

$ pip install requests

The API provides 4 HTTP methods for interacting with resources but in our case we are interested in creating or replacing new resources so we’ll use the POST method after executed a DELETE. Don’t forget to replace the WOOSMAP_PRIVATE_API_KEY with your own.

import requests

WOOSMAP_PRIVATE_API_KEY = '23713926-1af5-4321-ba54-032966f6e95d'

class Woosmap:
    """A wrapper around the Woosmap Data API."""
    WOOSMAP_API_HOSTNAME = 'api.woosmap.com'

    def __init__(self):
        self.session = requests.Session()

    def delete(self):
        self.session.delete('https://{hostname}/stores/'.format(hostname=self.WOOSMAP_API_HOSTNAME),
                            params={'private_key': WOOSMAP_PRIVATE_API_KEY})

    def post(self, payload):
        return self.session.post('https://{hostname}/stores/'.format(hostname=self.WOOSMAP_API_HOSTNAME),
                                 params={'private_key': WOOSMAP_PRIVATE_API_KEY},
                                 json={'stores': payload})

    def end(self):
        self.session.close()

The Woosmap Data API allows you to update your stores using PUT http verb. The update method is based on unique identifier of your data so be careful to keep the same identifier. Also, you are able to replace in one API call the whole project Assets by calling the endpoint /stores/replace with POST method. Check the full documentation for more detail.

Batch Import

If you have more than 1000 Assets to manage, we recommend you to set a rolling import every X elements. To do this, define a numerical constant in your code (BATCH_SIZE = 5) and execute your POST request each time this counter value is reached. In our script code, we’re using a slightly more different way as we import the Assets with chunk method. See below.

import range

BATCH_SIZE = 5

def batch(assets_data, n=1):
    l = len(assets_data)
    for ndx in range(0, l, n):
        yield assets_data[ndx:min(ndx + n, l)]

for chunk in batch(woosmap_assets, BATCH_SIZE):
    imported_success = import_assets(chunk, woosmap_api_helper)
    if imported_success:
        count_imported_assets += len(chunk)

Your BATCH_SIZE should actually be set closer to 100. If you remove this constant from the script code, your Assets will be imported one by one.

Exceptions Handling

An exception is an error that happens during execution of your program. When that error occurs, Python generate an exception that can be handled, which avoids your program to crash.

with open(file_path, 'r') as csv_file:
    try:
        reader = csv.DictReader(csv_file, dialect=MyCSVDialect())
    except csv.Error as csv_error:
        print('Error in CSV file found: {0}'.format(csv_error))

You can also raise an exception by using the raise exception statement. It’s really useful for handling potential errors that doesn’t generate exceptions, but will break your program, like a 400 response status code when calling the Woosmap API or an empty "Name" in your Assets source. When we do this, it has exactly the same effect as any other exception.

def get_name(asset):
    name = asset.get('Name', '')
    if name:
        return name
    else:
        raise ValueError('Unable to get the Name')


Related help articles


Search Results

algolia