By Ibrahim Rahimi


2019-02-09 07:03:02 8 Comments

I am a beginner Python developer and I wrote a script in Python that fetches the 50 latest jobs from the jobs.af (a jobs portal) API, categorizes them by the job's gender, and writes each category to a separate CSV and Excel sheet file. I want my code to be cleaner and more readable, so I would really like to get comments on code structure and on how to make the code more readable and nice.

#! /usr/bin/python
'''This is a simple command line Python program which fetches at most 50 latest
    jobs from the jobs.af API, accepts two optional arguments (--category='job category'
    --title='job title') and can filter jobs based on them, then it prints the result
    to an .xlsx workbook with three sheets, Male, Female and Any, according to the
    gender of the jobs.
'''

import urllib2
import json
import sys
import csv
import xlsxwriter
import argparse

# Create an ArgumentParser
parser = argparse.ArgumentParser(description = 'Fetch and list at most 50 latest\
                                jobs from "jobs.af" based on title, category, \
                                both of them or neither of them.'
                                )
# Create arguments using argparse object
parser.add_argument('--category', help = "takes job category name or it's id ")
parser.add_argument('--title' , help = 'takes job title as string')

# Some variables used for flag.
job_title = ''
job_category = ''
flag = True

# Use try/except to handle argument parsing.
try:
    parser.parse_args([])
    args = parser.parse_args()

    # Assign command line arguments to variables to pass them to url_builder
    job_category = args.category
    job_title = args.title
except:
    flag = False
    print 'please enter your search like this pattern: --category="category name" \
            --title="title name"'

# General url for jobs.af API
url = 'http://api.jobs.af/jobs?filter=1&per_page=50'

# Create the url(filter the request) to get data from jobs.af API
def url_builder(category = None, title = None):
    if category and title:
        title_query = title and '&position_title=' + title.replace(' ', '%20') or ''
        category_query = category and '&category=' + category.replace(' ', '%20') or ''
        global url
        return url + category_query + title_query

    elif category and not title:
        category_query = category and '&category=' + category.replace(' ', '%20') or ''
        return url + category_query

    elif title and not category:
        title_query = title and '&position_title=' + title.replace(' ', '%20') or ''
        return url + title_query

    else:
        url = 'http://api.jobs.af/jobs?per_page=50'
        return url


'''Get data from the API as a JSON object, extract the relevant fields of each job
   and write them to a workbook, in different sheets according to gender.
'''
def list_jobs(query):
    # Use urllib2 to load data as a json object.
    json_object = urllib2.urlopen(query)
    json_data = json.load(json_object)

    # Create a workbook using xlsxwriter to write data in it.
    workbook = xlsxwriter.Workbook('listJobs.xlsx')

    male_sheet = workbook.add_worksheet('Male')
    male_sheet.write_row('A1',['POSITION TITLE', 'SKILLS', 'EXPIRE-DATE',
                               'GENDER', 'LOCATION', 'CATEGORY'
                               ])

    female_sheet = workbook.add_worksheet('Female')
    female_sheet.write_row('A1',['POSITION TITLE', 'SKILLS', 'EXPIRE-DATE',
                                 'GENDER', 'LOCATION', 'CATEGORY'
                                 ])

    any_sheet = workbook.add_worksheet('Any')
    any_sheet.write_row('A1',['POSITION TITLE', 'SKILLS', 'EXPIRE-DATE',
                              'GENDER', 'LOCATION', 'CATEGORY'
                              ])

    # Open a CSV file.
    csv_file = open('jobs.csv', 'a')

    # Create an object of csv.writer to write to a csv file.
    csv_writer = csv.writer(csv_file)

    # Write to CSV file.
    csv_writer.writerow(['Position Title', 'skill', 'Expire Date', 'Gender',
                         'Location', 'Category'
                         ])

    # Counters
    any_counter = 1
    female_counter = 1
    male_counter = 1
    count = 0
    k = 0

    # Loop over the data to fetch each job's attributes
    for item in json_data['data']:
        # Get items and encode and decode them to write items to xlsx files. 
        title = item['position_title'].encode('utf-8')
        dtitle = title.decode('unicode-escape')
        skills = item['skills_requirement'].encode('utf-8')
        dskills = skills.decode('unicode-escape')
        expire = item['expire_date'].encode('utf-8')
        dexpire = expire.decode('unicode-escape')
        gender = item['gender'].encode('utf-8')
        dgender = gender.decode('unicode-escape')

        loc = item.get('location').get('data')
        state = ''
        for i in range(len(loc)):
            province = loc[i] 
            state = state + province['name_en'].encode('utf-8')
            dstate = state.decode('unicode-escape')

        category = item.get('category').get('data')
        category = category['name_en'].decode('utf-8')
        dcategory = category.decode('unicode-escape')
        # Update counter to count the number of jobs fetched.
        count = count + 1

        # Get gender attribute and check it to specify the sheet to write in to it.
        gender = item['gender']

        if gender == 'Male':
            male_sheet.write_row(male_counter,k,[dtitle, dskills, dexpire,
                                                dgender, dstate, dcategory
                                                ])
            male_counter = male_counter + 1

        elif gender == 'Female':
            female_sheet.write_row(female_counter, k,[dtitle, dskills, dexpire,
                                                     dgender, dstate, dcategory
                                                     ])
            female_counter = female_counter + 1

        else:
            any_sheet.write_row(any_counter, k,[dtitle, dskills, dexpire, dgender,
                                               dstate, dcategory
                                               ])
            any_counter = any_counter + 1

        # Write to CSV file 
        csv_writer.writerow([title, skills, expire, gender, state, category])

    # Close workbook
    workbook.close()

    # Prompt for user based on the result of fetching of jobs from jobs.af
    result1 = ''
    result2 = ''
    if job_category == None:
        result1 = 'any category'
    else:
        result1 = job_category

    if job_title == None:
        result2 = 'any title.'
    else:
        result2 = job_title


    if count == 0:
        print 'No job/s were/was found in jobs.af for category: ' + str(result1) + \
              ' and title: ' + str(result2)
    elif job_category == None and job_title == None:
        print str(count) + ' latest jobs found in jobs.af for category: ' + str(result1) + \
              ' and title: ' + str(result2) + ' were written to listJobs.xlsx.'

        print str(any_counter - 1) + ' of the found job/s are/is for any gender.'
        print str(male_counter - 1) + ' of the found job/s are/is for males.'
        print str(female_counter - 1) + ' of the found job/s are/is for females.'
    else:
        print str(count) + ' job/s were/was found in jobs.af for category: ' + str(result1) + \
              ' and title: ' + str(result2) + ' and were written to listJobs.xlsx.'

        print str(any_counter - 1) + ' of the found job/s are/is for any gender.'
        print str(male_counter - 1) + ' of the found job/s are/is for males.'
        print str(female_counter - 1) + ' of the found job/s are/is for females.'


if flag == True:
    # Call url_builder and assign its returned URL to url_query
    url_query = url_builder(job_category, job_title)
    # Call list_jobs with the specified URL
    list_jobs(url_query)

else:
    print 'Run program with correct argument pattern'

1 Answer

@Graipher 2019-02-09 10:03:35

As alluded to in the comments by @Peilonrayz, using the requests module would simplify your URL building and reading the result of the request:

import requests

url = 'http://api.jobs.af/jobs'
params = {"filter": 1, "per_page": 50, "category": category, "position_title": title}
json_data = requests.get(url, params=params).json()

If category or title are None, they will just be skipped and both the parameter names as well as the values will be URL encoded (so not only spaces replaced by %20, but also all other possible entities).
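For illustration, the same encoding behaviour can be reproduced with the standard library in Python 3 (`build_query` here is a hypothetical helper, not part of the original script or of `requests`):

```python
from urllib.parse import urlencode, quote

def build_query(category=None, title=None):
    """Build the jobs.af query string, skipping parameters that are None."""
    params = {"filter": 1, "per_page": 50,
              "category": category, "position_title": title}
    # Drop missing parameters, then percent-encode the rest;
    # quote_via=quote encodes spaces as %20 rather than '+'.
    present = {k: v for k, v in params.items() if v is not None}
    return urlencode(present, quote_via=quote)

print(build_query(category="IT Software"))
# filter=1&per_page=50&category=IT%20Software
```

With `requests` none of this is needed, since passing the dict as `params` does the filtering and encoding for you.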


In your argument parsing you use a try..except block in which you first parse without any arguments, then try to parse what the user supplied and then output a help message if it is not correct.

This is wrong on almost all accounts. You should not need the first empty parsing, you should basically never use a bare except clause (it also catches things like the user pressing Ctrl+C), and argparse will already generate an error message for you on wrong input. In addition, with your version, the program will continue running with invalid parameters. Instead it should just fail and stop right there.
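A minimal sketch of the simpler form, in Python 3 syntax (the argument names follow the question's script; the explicit list passed to `parse_args` is only for demonstration, normally you would call it with no arguments so it reads `sys.argv`):

```python
import argparse

parser = argparse.ArgumentParser(
    description='Fetch and list at most 50 latest jobs from jobs.af, '
                'optionally filtered by category and/or title.')
parser.add_argument('--category', help="job category name or its id")
parser.add_argument('--title', help='job title as a string')

# On invalid input argparse prints a usage message and exits on its own,
# so no try/except wrapper is needed.
args = parser.parse_args(['--category', 'IT'])
print(args.category, args.title)
```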


docstrings should go into the scope of the function:

def f(a, b):
    """Sums the values `a` and `b`."""
    return a + b

This way you can actually access it:

>>> print f.__doc__
# Sums the values `a` and `b`.

>>> help(f)
# Help on function f in module __main__:
# 
# f(a, b)
#     Sums the values `a` and `b`.

For the writing to excel sheets I would use a less manual approach. pandas has data frames and a to_excel method:

import pandas as pd

def combine_location(row):
    return " ".join(x["name_en"] for x in row['data'])

df = pd.DataFrame(json_data["data"])
df = df[["position_title", "skills_requirement", "expire_date", "gender",
         "location", "category"]]
df["location"] = df.location.apply(combine_location)
df["category"] = df.category.apply(lambda row: row["data"]["name_en"])
df.columns = ['Position Title', 'skill', 'Expire Date', 'Gender', 'Location',
              'Category']

writer = pd.ExcelWriter('listJobs.xlsx')
gdf = df.groupby("Gender", as_index=False)
gdf.apply(lambda df: df.to_excel(writer, df.iloc[0].Gender, index=False))
writer.save()

print "Number of jobs for:"
for gender, jobs in gdf.groups.items():
    print gender, len(jobs)

You should put your code that executes the rest under a if __name__ == "__main__": guard so that you can import from this script without executing everything.
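A sketch of the guard, in Python 3 syntax (`main` is a hypothetical wrapper around the fetching and writing steps; accepting an `argv` parameter also makes it testable without touching `sys.argv`):

```python
import argparse

def main(argv=None):
    """Parse arguments and run the job listing.

    With argv=None, argparse falls back to sys.argv[1:], so calling
    main() from the guard behaves like a normal script run.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument('--category')
    parser.add_argument('--title')
    args = parser.parse_args(argv)
    # ... fetch jobs and write the workbook here ...
    return args.category, args.title

if __name__ == "__main__":
    main()
```

Importing this module from another script then runs nothing; only executing it directly triggers `main()`.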


Python 2 will stop being officially supported in less than a year. Now is a good time to switch to Python 3.

@Ibrahim Rahimi 2019-02-10 06:49:27

Thanks for your nice comments, suggestions and solutions. I will change my Python version very soon. I would really like to learn Python in depth; do you have any suggestions on which modules and libraries I should work with?

@Graipher 2019-02-10 07:20:11

@IbrahimRahimi The itertools module is a must. If you want to do web scraping look at bs4.BeautifulSoup, and if you want to do anything related to data analysis there is no way around numpy (http://www.numpy.org/) and pandas.
