GraphLookup in MongoDB

$graphLookup is an aggregation pipeline stage that performs a recursive search in MongoDB. Starting from a value in each input document, it repeatedly follows references between documents, either within a single collection or from one collection into another.

Syntax:

{
   $graphLookup: {
      from: <collection name>,
      startWith: <expression>,
      connectFromField: <field name>,
      connectToField: <field name>,
      as: <new field name to store the recursion result>,
      maxDepth: <maximum recursion depth>,  // Optional
      depthField: <new field name to store the depth>,  // Optional
      restrictSearchWithMatch: <document>  // Optional
   }
}

Parameter explanation:

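  • from: The collection in which to perform the graph lookup, i.e. the recursive search.

  • startWith: An expression, usually a field of the input document such as ‘$senior_managers’, whose value is where the recursion starts for each input document. The value is matched against the ‘connectToField’ in the ‘from’ collection.

  • connectFromField: For each document matched in the ‘from’ collection, $graphLookup takes the value of connectFromField and matches it against the connectToField of other documents in the ‘from’ collection. This is how the recursion continues.

  • connectToField: The field in the ‘from’ collection that is matched against the startWith value and, in later iterations, against the connectFromField values.

  • as: A new field name that stores the result for each input document as an array.

  • maxDepth: A non-negative number giving the maximum depth of the recursion. It is an optional parameter.

  • depthField: A new field name, added to each document in the result array, that stores the depth at which that document was found. The depth value starts at zero. It is an optional parameter.

  • restrictSearchWithMatch: A document with additional conditions that documents must satisfy to be included in the recursive search. It is an optional parameter.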
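
To make the roles of these parameters concrete, here is a simplified in-memory model of the traversal in plain JavaScript. It is only a sketch for illustration, not MongoDB's implementation: the function graphLookupSketch, the sample documents and the name field are all made up, and it ignores restrictSearchWithMatch and array-valued connectToField fields.

```javascript
// Illustrative model of the $graphLookup traversal (not MongoDB's actual code).
// Assumption: startWith is a plain field path such as '$senior_managers'.
function graphLookupSketch(inputDoc, fromDocs, opts) {
  const { startWith, connectFromField, connectToField, depthField, maxDepth = Infinity } = opts;
  const toArray = (v) => (v === undefined || v === null ? [] : [].concat(v));
  const visited = new Set(); // _id values already added to the result
  const result = [];
  // Depth 0 starts from the value of startWith in the input document.
  let frontier = toArray(inputDoc[startWith.replace(/^\$/, '')]);
  for (let depth = 0; frontier.length > 0 && depth <= maxDepth; depth++) {
    const next = [];
    for (const doc of fromDocs) {
      if (visited.has(doc._id) || !frontier.includes(doc[connectToField])) continue;
      visited.add(doc._id);
      result.push(depthField ? { ...doc, [depthField]: depth } : { ...doc });
      // The connectFromField values feed the next round of matching.
      next.push(...toArray(doc[connectFromField]));
    }
    frontier = next;
  }
  return result;
}

// Hypothetical data: A reports to B, and B reports to C.
const sampleManagers = [
  { _id: 1, name: 'A', senior_managers: [2] },
  { _id: 2, name: 'B', senior_managers: [3] },
  { _id: 3, name: 'C', senior_managers: [] },
];

const chain = graphLookupSketch(sampleManagers[0], sampleManagers, {
  startWith: '$senior_managers',
  connectFromField: 'senior_managers',
  connectToField: '_id',
  depthField: 'depth',
});
console.log(chain.map((d) => `${d.name} (depth ${d.depth})`));
```

For manager A, the sketch finds B at depth 0 and C at depth 1. Passing maxDepth: 0 in the options would stop after B.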

Considerations:

  • Sharded Collections:

    • Sharding is a method of distributing data across multiple machines when the data set is large. Starting in MongoDB 5.1, the collection given in the ‘from’ parameter can be sharded.

  • Max Depth:

    • The maxDepth parameter of the $graphLookup stage sets the maximum depth of recursion for the query. Setting it to 0 means no recursion beyond the initial match, which is equivalent to a non-recursive $lookup.
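
As a rough illustration of how maxDepth is set, the sketch below builds a $graphLookup stage with and without a depth limit. buildSeniorManagersStage is a hypothetical helper and ‘level’ is a made-up depthField name. The collection and field names are the ones used in the demonstration later in this post.

```javascript
// Builds a $graphLookup stage for the 'managers' collection.
// buildSeniorManagersStage is a hypothetical helper, not a MongoDB API.
function buildSeniorManagersStage(maxDepth) {
  const stage = {
    from: 'managers',
    startWith: '$senior_managers',
    connectFromField: 'senior_managers',
    connectToField: '_id',
    as: 'AllParentManagers',
    depthField: 'level', // hypothetical field name; 0 marks direct senior managers
  };
  // Leave maxDepth out entirely to allow unlimited recursion.
  if (maxDepth !== undefined) stage.maxDepth = maxDepth;
  return { $graphLookup: stage };
}

const directOnly = buildSeniorManagersStage(0); // no recursion beyond the first match
const twoLevels = buildSeniorManagersStage(1);  // direct senior managers plus their seniors
console.log(JSON.stringify(directOnly, null, 2));
```

Either stage object can be placed in the pipeline array passed to aggregate().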

Memory:

  • The $graphLookup stage must stay within the 100 megabyte memory limit.

  • The aggregate() operation takes an allowDiskUse option.

  • With { allowDiskUse: true }, pipeline stages that need more than 100 megabytes of memory can write temporary files to disk.

  • With { allowDiskUse: false }, a pipeline stage that needs more than 100 megabytes of memory raises an error.

  • Even if the aggregate() operation specifies { allowDiskUse: true }, the $graphLookup stage ignores the option. It only takes effect for the other stages in the pipeline.
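
The option is passed as the second argument to aggregate(). The following is a minimal sketch, assuming a pipeline that has a $sort stage after the lookup. runWithDiskUse is a hypothetical wrapper, and ‘collection’ stands for a collection handle such as db.managers in the shell.

```javascript
// Hypothetical pipeline: $graphLookup followed by a $sort stage.
const pipeline = [
  {
    $graphLookup: {
      from: 'managers',
      startWith: '$senior_managers',
      connectFromField: 'senior_managers',
      connectToField: '_id',
      as: 'AllParentManagers',
    },
  },
  { $sort: { _id: 1 } },
];

// allowDiskUse applies to stages such as $sort, not to $graphLookup itself.
const options = { allowDiskUse: true };

// 'collection' is assumed to be a collection handle, for example db.managers.
function runWithDiskUse(collection) {
  return collection.aggregate(pipeline, options);
}
```

In the shell this would be called as runWithDiskUse(db.managers).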

Views and Collation:

  • If an aggregation involves multiple views, as can happen with $lookup or $graphLookup, the views must have the same collation.

  • Collation lets users specify language-specific rules for string comparison, such as rules for letter case and accent marks.

Demonstration:

Consider two collections - employees and managers.

Schema diagram:

[Image: schema diagram]

  • In the ‘employees’ collection, the managers field lists the managers associated with a particular employee. It contains an array of manager ids, i.e. _id values from the ‘managers’ collection.

  • In the ‘managers’ collection, the senior_managers field lists the senior managers of a particular manager. It holds an array of manager ids, i.e. _id values from the ‘managers’ collection.

Managers collection:

[Image: managers collection]

Employees collection:

[Image: employees collection]

Structure of employees and managers in my database:

Employees Example:

[Image: employees example diagram]

Managers Example:

[Image: managers example diagram]

Scenarios:

  • Case 1: Within the collection

  • Case 2: Across collections

Case 1: Within the collection

To find the senior managers associated with any particular manager, we have to do a recursion within the ‘managers’ collection.

Note: Run the $graphLookup query below on the ‘managers’ collection.

Example: Jack's senior managers are Tom, Laura, Steve, Mary and Chris, i.e. 5 documents, as depicted in the 'Managers Example' picture.

db.managers.aggregate([
   {
      // Follow senior_managers references recursively within 'managers'
      $graphLookup: {
         from: 'managers',
         startWith: '$senior_managers',
         connectFromField: 'senior_managers',
         connectToField: '_id',
         as: 'AllParentManagers'
      }
   }
])

Output: For each manager, the ‘AllParentManagers’ field holds all of that manager's senior managers.

[Image: Case 1 output]
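
If only the names are of interest, a $project stage can follow the lookup. The sketch below assumes each manager document has a name field, which the example names suggest but the text does not confirm. case1Pipeline and seniorManagerNames are illustrative names only.

```javascript
const case1Pipeline = [
  {
    $graphLookup: {
      from: 'managers',
      startWith: '$senior_managers',
      connectFromField: 'senior_managers',
      connectToField: '_id',
      as: 'AllParentManagers',
    },
  },
  // '$AllParentManagers.name' collects the name of every document in the array.
  // Assumption: manager documents store their name in a field called 'name'.
  { $project: { name: 1, seniorManagerNames: '$AllParentManagers.name' } },
];

// In the shell this would be run as: db.managers.aggregate(case1Pipeline)
console.log(JSON.stringify(case1Pipeline, null, 2));
```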

Case 2: Across collections

To find the managers associated with any particular employee, we have to do a recursion across the ‘employees’ and ‘managers’ collections. The lookup starts from the employee's managers field and then recurses within the ‘managers’ collection.

Note: Run the $graphLookup query below on the ‘employees’ collection.

Example: Alex's managers are Jack, Julie, Tom, Steve, Laura, Mary, Chris, Sarah and Matt, i.e. 9 documents.

db.employees.aggregate([
   {
      // Start from each employee's managers, then recurse within 'managers'
      $graphLookup: {
         from: 'managers',
         startWith: '$managers',
         connectFromField: 'senior_managers',
         connectToField: '_id',
         as: 'AllParentManagers'
      }
   }
])

Output: For each employee, the ‘AllParentManagers’ field holds the employee's managers together with their senior managers.

[Image: Case 2 output]
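
One way to check a result like the 9 documents quoted for Alex is to count the entries in the result array with $size. This is a sketch under the assumption that employee documents have a name field. case2Pipeline and managerCount are made-up names.

```javascript
const case2Pipeline = [
  // Assumption: employees are identified by a 'name' field.
  { $match: { name: 'Alex' } },
  {
    $graphLookup: {
      from: 'managers',
      startWith: '$managers',
      connectFromField: 'senior_managers',
      connectToField: '_id',
      as: 'AllParentManagers',
    },
  },
  // $size returns the number of elements in the AllParentManagers array.
  { $project: { name: 1, managerCount: { $size: '$AllParentManagers' } } },
];

// In the shell this would be run as: db.employees.aggregate(case2Pipeline)
console.log(JSON.stringify(case2Pipeline, null, 2));
```

If the data matches the example above, managerCount for Alex should come out as 9.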

--------------------- THANKS FOR READING😇 ---------------------