for any info/changes follow me: @nickmilon

mongoUtils.importsExports module

Classes used to import/export data to mongoDB

mongoUtils.importsExports.import_workbook(workbook, db, fields=None, ws_options={'dt_python': True}, stats_every=1000)

saves all of a workbook’s sheets to a db; consider using the ImportXls class instead, which is more flexible but imports only a single sheet

Parameters:

see ImportXls class

Example:

>>> from pymongo import MongoClient
>>> from mongoUtils import _PATH_TO_DATA
>>> db = MongoClient().test
>>> res = import_workbook(_PATH_TO_DATA + "example_workbook.xlsx", db)
>>> res
[{'rows': 368, 'db': 'test', 'collection': 'weather'}, {'rows': 1007, 'db': 'test', 'collection': 'locations'}]

class mongoUtils.importsExports.Import(collection, drop_collection=True, stats_every=10000)

Bases: object

generic class for importing into a mongoDB collection; successors should use/extend this class

Parameters:

db: a pymongo database object that will be used for output
collection: a pymongo collection object that will be used for output
drop_collection: (defaults to True)
- True drops output collection on init before writing to it
- False appends to output collection
stats_every: int print import stats every stats_every rows or 0 to cancel stats (defaults to 10000)

format_stats = '|{db:16s}|{collection:16s}|{rows:15,d}|'

format_stats_header = '...................................................\n| db | collection | rows |\n...................................................'
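As a rough illustration of what the format_stats template produces, the sketch below applies it to a stats dictionary. The template string is copied from the attribute above; the `info` dictionary is a made-up sample shaped like the results shown in the examples, not output from a real import.

```python
# Row template copied from the documented Import.format_stats attribute.
format_stats = '|{db:16s}|{collection:16s}|{rows:15,d}|'

# Hypothetical stats dictionary, shaped like the example results on this page.
info = {'db': 'test', 'collection': 'weather', 'rows': 1007}

# Strings are left-aligned in 16 columns; the row count is right-aligned
# in 15 columns with a thousands separator.
line = format_stats.format(**info)
print(line)
```

Each rendered row is 51 characters wide, so successive rows line up as a table.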

__init__(collection, drop_collection=True, stats_every=10000)

import_to_collection(): successors should implement this

_import_to_collection_before(): successors can call this or implement their own

_import_to_collection_after(): successors can call this or implement their own

print_stats()
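The division of labour between import_to_collection and its before/after hooks can be sketched as follows. This is not the library's code: `ImportBaseSketch` is a hypothetical stand-in that only mirrors the documented method names, `ImportPairs` is an invented successor, and a plain list stands in for a pymongo collection so the sketch runs without a database.

```python
class ImportBaseSketch(object):
    """Hypothetical stand-in for Import; it only mirrors the documented method names."""

    def __init__(self, collection, stats_every=10000):
        self.collection = collection  # here a plain list, standing in for a collection
        self.stats_every = stats_every
        self.rows = 0

    def import_to_collection(self):
        raise NotImplementedError("successors should implement this")

    def _import_to_collection_before(self):
        self.rows = 0  # reset the row counter before writing

    def _import_to_collection_after(self):
        return {'rows': self.rows}  # summary handed back to the caller


class ImportPairs(ImportBaseSketch):
    """Hypothetical successor that imports (key, value) pairs."""

    def __init__(self, pairs, collection, stats_every=10000):
        super(ImportPairs, self).__init__(collection, stats_every)
        self.pairs = pairs

    def import_to_collection(self):
        self._import_to_collection_before()
        for key, value in self.pairs:
            self.collection.append({'_id': key, 'value': value})
            self.rows += 1
        return self._import_to_collection_after()


target = []
result = ImportPairs([('a', 1), ('b', 2)], target).import_to_collection()
```

A real successor would extend Import itself and write to the pymongo collection it receives.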

class mongoUtils.importsExports.ImportXls(workbook, sheet, db, coll_name=None, row_start=None, row_end=None, fields=True, ws_options={'negatives_to_0': False, 'dt_python': True, 'integers_only': False}, stats_every=10000, drop_collection=True)

Bases: mongoUtils.importsExports.Import

saves an xls sheet to a collection

Parameters:

workbook: path to a workbook or an xlrd workbook object
sheet: name of a work sheet in workbook or an int (sheet number in workbook)
db: a pymongo database object
coll_name: str output collection name or None to create name from sheet name (defaults to None)
row_start: int or None starting row or None to start from first row (defaults to None)
row_end: int or None ending row or None to end at last row (defaults to None)
fields:
- a list with field names
- or True (to treat first row as field names)
- or None (for auto creating field names, e.g. [fld_1, fld_2, etc])
- or a function that:
  
  takes one argument (a list of row values)
  
  returns a dict (if this dict contains a key ‘_id’ this value will be used for _id)
  
  >>> lambda x: {'coordinates': [x[0] , x[1]]}
ws_options: (optional) a dictionary specifying how to treat cell values
- dt_python : bool convert dates to python datetime
- integers_only : round float values to int; helpful because all int values are represented as floats in sheets
- negatives_to_0 : treat all negative numbers as 0’s
drop_collection: (defaults to True)
- True drops output collection on init before writing to it
- False appends to output collection
stats_every: int print import stats every stats_every rows or 0 to cancel stats (defaults to 10000)
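To make the function form of the fields parameter more concrete, here is a minimal sketch of such a callable. The name `row_to_fields`, the column layout and the sample row are assumptions for illustration; only the contract (a list of row values in, a dict out, with an optional '_id' key) comes from the description above.

```python
def row_to_fields(row):
    """Hypothetical `fields` callable: takes a list of row values, returns a dict."""
    return {
        '_id': row[0],                    # per the docs, an '_id' key is used as the document _id
        'coordinates': [row[1], row[2]],  # assumes columns 2 and 3 hold the two coordinates
    }


# Made-up row values, standing in for one sheet row.
sample_row = ['station-1', 23.7, 37.9]
doc = row_to_fields(sample_row)
print(doc)
```

A function like this would be passed as fields=row_to_fields when constructing ImportXls.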

Example:

>>> from pymongo import MongoClient
>>> from mongoUtils import _PATH_TO_DATA
>>> db = MongoClient().test
>>> res = ImportXls(_PATH_TO_DATA + "example_workbook.xlsx", 0, db)()
>>> res
{'rows': 367, 'db': u'test', 'collection': u'weather'}

__init__(workbook, sheet, db, coll_name=None, row_start=None, row_end=None, fields=True, ws_options={'negatives_to_0': False, 'dt_python': True, 'integers_only': False}, stats_every=10000, drop_collection=True)

ws_options

ws_options_set(options_dict)

fix_name(name, cnt=0)

auto_field_names(fields)

row_to_doc(valueslist, _id=None)

ws_convert_cell(cl)

Parameters:	cl: an xlrd cell object
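The numeric ws_options documented for ImportXls (integers_only and negatives_to_0) can be pictured with the sketch below. It works on plain Python values rather than xlrd cell objects, leaves out the dt_python date conversion, and `convert_number` is a hypothetical helper, not the library's ws_convert_cell; the order in which the two options are applied is also an assumption.

```python
def convert_number(value, options):
    """Hypothetical helper applying the documented numeric ws_options to one value."""
    if not isinstance(value, float):
        return value  # non-numeric values pass through untouched
    if options.get('negatives_to_0') and value < 0:
        value = 0.0  # treat negative numbers as 0
    if options.get('integers_only'):
        return int(round(value))  # sheets represent ints as floats, so round back
    return value


opts = {'dt_python': True, 'integers_only': True, 'negatives_to_0': True}
converted = [convert_number(v, opts) for v in [3.0, -2.0, 'text']]
print(converted)
```

With both options left at False, as in the documented defaults, numeric values would pass through unchanged.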

import_to_collection()
