for any info/changes follow me: @nickmilon
mongoUtils.importsExports module
Classes used to import/export data to mongoDB
mongoUtils.importsExports.import_workbook(workbook, db, fields=None, ws_options={'dt_python': True}, stats_every=1000)
Saves all of a workbook's sheets to a db. Consider using the ImportXls class instead, which is more flexible but imports only a single sheet.
Parameters: see the ImportXls class
Example:
>>> from pymongo import MongoClient
>>> from mongoUtils import _PATH_TO_DATA
>>> from mongoUtils.importsExports import import_workbook
>>> db = MongoClient().test
>>> res = import_workbook(_PATH_TO_DATA + "example_workbook.xlsx", db)
>>> res
[{'rows': 368, 'db': 'test', 'collection': 'weather'}, {'rows': 1007, 'db': 'test', 'collection': 'locations'}]
class mongoUtils.importsExports.Import(collection, drop_collection=True, stats_every=10000)
Bases: object
Generic class for importing into a mongoDB collection; successors should use/extend this class.
Parameters:
db: a pymongo database object that will be used for output
collection: a pymongo collection object that will be used for output
- drop_collection: (defaults to True)
- True drops output collection on init before writing to it
- False appends to output collection
stats_every: int print import stats every stats_every rows or 0 to cancel stats (defaults to 10000)
format_stats = '|{db:16s}|{collection:16s}|{rows:15,d}|'
format_stats_header = '...................................................\n| db | collection | rows |\n...................................................'
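The format_stats attribute above is an ordinary Python format string, so its effect can be checked in isolation. The following is a minimal sketch, assuming the stats are available as a dict with the keys db, collection and rows (as in the example results on this page); the helper name render_stats is hypothetical and not part of mongoUtils.

```python
# Format string copied from the documented Import.format_stats attribute.
FORMAT_STATS = '|{db:16s}|{collection:16s}|{rows:15,d}|'


def render_stats(stats):
    """Render a stats dict (keys: db, collection, rows) as one table row.

    Hypothetical helper for illustration; not part of mongoUtils.
    """
    # db and collection are left-aligned in 16 columns each;
    # rows is right-aligned in 15 columns with thousands separators.
    return FORMAT_STATS.format(**stats)


line = render_stats({'db': 'test', 'collection': 'weather', 'rows': 1234567})
print(line)
```

Every rendered row has the same width, which is what lets the rows line up as a table when stats are printed repeatedly.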
class mongoUtils.importsExports.ImportXls(workbook, sheet, db, coll_name=None, row_start=None, row_end=None, fields=True, ws_options={'negatives_to_0': False, 'dt_python': True, 'integers_only': False}, stats_every=10000, drop_collection=True)
Bases: mongoUtils.importsExports.Import
Saves an xls sheet to a collection.
Parameters:
workbook: path to a workbook or an xlrd workbook object
sheet: name of a work sheet in workbook or an int (sheet number in workbook)
db: a pymongo database object
coll_name: str output collection name or None to create name from sheet name (defaults to None)
row_start: int or None starting row or None to start from first row (defaults to None)
row_end: int or None ending row or None to end at last row (defaults to None)
- fields:
a list with field names
or True (to treat first row as field names)
or None (for auto creating field names, i.e. [fld_1, fld_2, etc.])
- or a function that:
takes one argument (a list of row values)
returns a dict (if this dict contains a key '_id', its value will be used for _id)
>>> lambda x: {'coordinates': [x[0], x[1]]}
- ws_options: (optional) a dictionary specifying how to treat cell values
- dt_python: bool, convert dates to python datetime
- integers_only: round float values to int; helpful because all int values are represented as floats in sheets
- negatives_to_0: treat all negative numbers as 0
- drop_collection: (defaults to True)
- True drops output collection on init before writing to it
- False appends to output collection
stats_every: int print import stats every stats_every rows or 0 to cancel stats (defaults to 10000)
Example:
>>> from pymongo import MongoClient
>>> from mongoUtils import _PATH_TO_DATA
>>> from mongoUtils.importsExports import ImportXls
>>> db = MongoClient().test
>>> res = ImportXls(_PATH_TO_DATA + "example_workbook.xlsx", 0, db)()
>>> res
{'rows': 367, 'db': u'test', 'collection': u'weather'}
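The fields parameter described above may be a function that takes a list of row values and returns a dict. A named function can be easier to read than a lambda once a row has more than a couple of columns. The sketch below assumes a hypothetical three-column sheet (station id, longitude, latitude); the column layout and the name row_to_doc are illustrative and do not come from the example workbook.

```python
def row_to_doc(row):
    """Map one sheet row (a list of cell values) to a document dict.

    Assumed, hypothetical column layout: [station_id, longitude, latitude].
    """
    return {
        '_id': row[0],                    # a returned '_id' key is used as the document _id
        'coordinates': [row[1], row[2]],  # group two columns into one field
    }


# Stand-alone check with a made-up row; no workbook or database needed.
sample_row = ['ST-001', 23.72, 37.98]
doc = row_to_doc(sample_row)
print(doc)
```

Going by the parameter description, such a function would be passed as fields=row_to_doc when constructing ImportXls.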
__init__(workbook, sheet, db, coll_name=None, row_start=None, row_end=None, fields=True, ws_options={'negatives_to_0': False, 'dt_python': True, 'integers_only': False}, stats_every=10000, drop_collection=True)
ws_options
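To make the integers_only and negatives_to_0 entries of ws_options more concrete, here is a rough sketch of what such cell-value handling could look like. This is not the library's implementation: the function name apply_ws_options, the order in which the options are applied, and the use of round() are all assumptions made for illustration. Date conversion (dt_python) is left out because it depends on the workbook.

```python
def apply_ws_options(value, ws_options):
    """Illustrative only; NOT how mongoUtils treats cells internally.

    Applies two of the documented options to a single cell value.
    """
    # Leave text, None and booleans untouched (bool is a subclass of int).
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return value
    # Assumed order: clamp negatives first, then convert floats to int.
    if ws_options.get('negatives_to_0') and value < 0:
        value = 0
    if ws_options.get('integers_only') and isinstance(value, float):
        value = int(round(value))
    return value


options = {'negatives_to_0': True, 'integers_only': True}
cells = [3.0, -4.2, 'n/a', 2.7]
cleaned = [apply_ws_options(c, options) for c in cells]
print(cleaned)
```

The integers_only case reflects the note in the parameter list that sheets store integer values as floats, so a cell holding 3 is read back as 3.0 unless it is converted.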