for any info/changes follow me: @nickmilon

mongoUtils.aggregation module

aggregation operations

class mongoUtils.aggregation.Aggregation(collection, pipeline=None, **kwargs)[source]

Bases: object

a helper for constructing aggregation pipelines (see: MongoDB aggregation framework); supports all aggregation operators

Parameters:
  • collection (obj) – a pymongo collection object
  • pipeline (list) – (optional) an initial pipeline list
  • kwargs (dict) – (optional) any arguments
Returns:

an aggregation object

Example:
>>> from pymongo import MongoClient;from mongoUtils.configuration import testDbConStr  # import MongoClient
>>> from mongoUtils.aggregation import Aggregation                                      # import the helper class
>>> db = MongoClient(testDbConStr).get_default_database()                              # get test database
>>> aggr_obj = Aggregation(db.muTest_tweets_users, allowDiskUse=True)                  # select users collection
>>> aggr_obj.help()                                                                    # ask for help
['project', 'match', 'redact', 'limit', .... ]                                         # available operators
>>> aggr_obj.match({'lang': 'en'})                                                     # match English speaking
>>> aggr_obj.group({'_id': None, "avg_followers": {"$avg": "$followers_count"}})       # get average followers
>>> print(aggr_obj.code(False))                                                        # print pipeline
[{"$match": {"lang": "en"}},{"$group": {"avg_followers":
{"$avg": "$followers_count"},"_id": null}}]
>>> next(aggr_obj())                                                                   # execute and get results
{u'avg_followers': 2943.8210227272725, u'_id': None}                                   # results
_operators = ['project', 'match', 'redact', 'limit', 'skip', 'sort', 'unwind', 'group', 'out', 'geoNear']
_frmt_str = '{}\nstage#= {:2d}, operation={}'
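The example above builds a pipeline by calling one method per operator. A minimal sketch of that idea follows; it runs without MongoDB. PipelineSketch is a hypothetical stand-in, and routing operator names through __getattr__ is an assumption for illustration, not necessarily how Aggregation is implemented.

```python
class PipelineSketch:
    """Hypothetical stand-in, not mongoUtils code: collects stages in a list."""

    # same operator names as the _operators attribute documented above
    operators = ['project', 'match', 'redact', 'limit', 'skip',
                 'sort', 'unwind', 'group', 'out', 'geoNear']

    def __init__(self, pipeline=None):
        # copy the initial pipeline so the caller's list is not mutated
        self.pipeline = list(pipeline) if pipeline else []

    def __getattr__(self, name):
        # invoked only when normal attribute lookup fails
        if name in self.operators:
            def stage(value):
                # each call appends one {'$operator': value} stage
                self.pipeline.append({'$' + name: value})
            return stage
        raise AttributeError(name)


sketch = PipelineSketch()
sketch.match({'lang': 'en'})
sketch.group({'_id': None, 'avg_followers': {'$avg': '$followers_count'}})
print(sketch.pipeline)
```

The resulting list has the same shape as the pipeline printed by code() in the example, so it could be handed to any API that accepts a list of stage documents.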
__init__(collection, pipeline=None, **kwargs)[source]
classmethod construct_fields(fields_list=[])[source]

a constructor for fields

classmethod construct_stats(fields_lst, _id=None, stats=['avg', 'max', 'min'], incl_count=True)[source]

a constructor helper for group statistics

Parameters:
  • fields_lst (list) – list of field names
  • _id – value for the group _id, defaults to None
  • stats (list) – list of statistics, defaults to ['avg', 'max', 'min']
  • incl_count (bool) – includes a count if True
Example:
>>> Aggregation.construct_stats(['foo'], incl_count=False)
{'max_foo': {'$max': '$foo'}, '_id': None, 'avg_foo': {'$avg': '$foo'}, 'min_foo': {'$min': '$foo'}}
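To make the shape of that result concrete, here is a small sketch that reproduces the documented output with plain dictionaries. group_stats_sketch is a hypothetical name, and the form of the count entry ({'$sum': 1} under the key 'count') is an assumption, since the documentation does not show it.

```python
def group_stats_sketch(fields, _id=None, stats=('avg', 'max', 'min'), incl_count=True):
    """Hypothetical re-creation of the documented output shape, not library source."""
    spec = {'_id': _id}
    for field in fields:
        for stat in stats:
            # e.g. stat='avg', field='foo' gives 'avg_foo': {'$avg': '$foo'}
            spec['{}_{}'.format(stat, field)] = {'$' + stat: '$' + field}
    if incl_count:
        # assumption: the count is a running sum of 1 per grouped document
        spec['count'] = {'$sum': 1}
    return spec


spec = group_stats_sketch(['foo'], incl_count=False)
print(spec)
```

A dictionary of this shape is what a $group stage expects, so it can be passed directly to group().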
pipeline

returns the pipeline (a list)

classmethod help(what='operators')[source]

returns list of available operators

add(operator, value, position=None)[source]

adds an operation at specified position in pipeline

search(operator, count=1)[source]

returns (position, operator)

save(file_pathname)[source]

save pipeline list to file

remove(position)[source]

remove an element from pipeline list given its position
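Because the pipeline is an ordinary list, add, search, save and remove can be pictured as list operations. The sketch below works on a bare list; add_stage and search_stage are hypothetical helpers, the count argument of search is not modelled, and JSON as the file format for save is an assumption.

```python
import json
import os
import tempfile

pipeline = [{'$match': {'lang': 'en'}}, {'$limit': 10}]


def add_stage(pipeline, operator, value, position=None):
    """Insert {'$operator': value} at position, or append when position is None."""
    stage = {'$' + operator: value}
    if position is None:
        pipeline.append(stage)
    else:
        pipeline.insert(position, stage)


def search_stage(pipeline, operator):
    """Return (position, stage) for the first stage using operator, else None."""
    key = '$' + operator
    for position, stage in enumerate(pipeline):
        if key in stage:
            return position, stage
    return None


add_stage(pipeline, 'sort', {'followers_count': -1}, position=1)
found = search_stage(pipeline, 'limit')  # the limit stage moved from index 1 to 2
del pipeline[0]                          # remove a stage given its position

# assumption: the pipeline is written to disk as JSON
path = os.path.join(tempfile.mkdtemp(), 'pipeline.json')
with open(path, 'w') as handle:
    json.dump(pipeline, handle)
with open(path) as handle:
    restored = json.load(handle)
```

Stage order matters in an aggregation, which is why inserting at an explicit position is useful: here the sort is placed before the limit rather than after it.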

code(verbose=True)[source]
clear()[source]
__call__(print_n=None, **kwargs)[source]

performs the aggregation when the instance is called, e.g. >>> Aggregation_object()

for kwargs see: pymongo Collection.aggregate

Parameters:
  • print_n:
    • True: prints the results and returns None
    • None: disables result printing
    • int: prints the top n documents
  • kwargs: any kwargs specified here override the arguments provided on instance initialization.
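The print_n and kwargs rules above can be modelled without a database. In this sketch, show and merge_options are hypothetical helpers and an iterator of dictionaries stands in for the cursor; returning None in the int case is an assumption, as the documentation only states it for True.

```python
from itertools import islice


def merge_options(init_kwargs, call_kwargs):
    """Call-time kwargs override those given at initialization."""
    merged = dict(init_kwargs)
    merged.update(call_kwargs)
    return merged


def show(results, print_n=None):
    """Model the three documented print_n cases on any iterable of documents."""
    if print_n is None:
        return results                  # no printing, hand the results back
    # True means print everything; an int means print only the first n
    limit = None if print_n is True else print_n
    for doc in islice(results, limit):
        print(doc)
    return None


docs = [{'_id': 'en', 'count': 352}, {'_id': 'ja', 'count': 283}, {'_id': 'es', 'count': 100}]
options = merge_options({'allowDiskUse': True}, {'allowDiskUse': False})
untouched = show(iter(docs))            # returns the iterator unchanged
printed = show(iter(docs), print_n=2)   # prints the first two documents
```

The `print_n is True` test comes before any numeric handling because True is also an int in Python and would otherwise be read as "print one document".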

class mongoUtils.aggregation.AggrCounts(collection, field, match=None, sort={'count': -1}, **kwargs)[source]

Bases: mongoUtils.aggregation.Aggregation

constructs a group count aggregation pipeline based on Aggregation class

Parameters:
  • collection (obj) – a pymongo collection object
  • field (str) – field name
  • match (dict) – a query match expression, defaults to None
  • sort (dict) – a sort expression, defaults to {'count': -1}
  • kwargs (dict) – optional arguments to pass to parent Aggregation
Example:
>>> from pymongo import MongoClient;from mongoUtils.configuration import testDbConStr  # import MongoClient
>>> from mongoUtils.aggregation import AggrCounts                                       # import the helper class
>>> db = MongoClient(testDbConStr).get_default_database()                              # get test database
>>> AggrCounts(db.muTest_tweets_users, "lang",  sort={'count': -1})(print_n=True)      # counts by language
{u'count': 352, u'_id': u'en'}
{u'count': 283, u'_id': u'ja'}
{u'count': 100, u'_id': u'es'}  ...
__init__(collection, field, match=None, sort={'count': -1}, **kwargs)[source]
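A group-count pipeline of the kind described here can be sketched as a plain list of stages. counts_pipeline_sketch is a hypothetical function; the stage layout (optional $match, then $group, then $sort) is an assumption consistent with the documented parameters and the output shown above, and the verified field in the usage line is illustrative only.

```python
def counts_pipeline_sketch(field, match=None, sort=None):
    """Hypothetical layout of a group-count pipeline, not the library source."""
    if sort is None:
        sort = {'count': -1}            # mirrors the documented default
    pipeline = []
    if match is not None:
        # filter first so that only matching documents are counted
        pipeline.append({'$match': match})
    # one output document per distinct value of field, with its frequency
    pipeline.append({'$group': {'_id': '$' + field, 'count': {'$sum': 1}}})
    if sort:
        pipeline.append({'$sort': sort})
    return pipeline


stages = counts_pipeline_sketch('lang', match={'verified': True})
print(stages)
```

Sorting on count descending puts the most frequent values first, which matches the ordering of the example output (en, ja, es).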