DEV Community

Prince Raj

Enhancing Mongoose Reference Handling in Node.js

Working with Mongoose in Node.js is generally straightforward, but it gets tricky when you need to handle complex nested structures with references across models. I recently built a function that dynamically processes these references within Mongoose schemas, takes indexed fields into account, and handles nested objects and arrays. In this post, I’ll walk you through the solution, which can cut down on repetitive lookup code and help avoid duplicate documents.

The Challenge

When creating a Mongoose schema with references to other models, it’s common to encounter complex structures where:

  1. References are deeply nested across multiple levels of your data.
  2. Arrays contain references to other models.
  3. Indexed fields must be respected to avoid duplication and enforce uniqueness.
  4. Upserts (update or insert) are needed to avoid creating duplicate documents.

Handling all of this manually can be error-prone and cumbersome. This led me to design a utility function that automatically processes references, validates against indexed fields, and dynamically handles nested structures.

The Solution

Below is the detailed implementation of this utility. It tackles the aforementioned challenges and allows you to effortlessly process complex structures when inserting or updating documents.

Core Utility Functions

These functions are the backbone of our solution. They help identify whether a field is a reference, detect indexed fields, and handle different data types (ObjectId, arrays, nested objects, etc.).


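import { Model, Schema } from "mongoose";

// Type guards for the two schema types that can carry a reference.
function isSchemaTypeArray(schemaType: any): schemaType is Schema.Types.Array {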
  return schemaType instanceof Schema.Types.Array;
}

function isSchemaTypeObjectId(
  schemaType: any
): schemaType is Schema.Types.ObjectId {
  return schemaType instanceof Schema.Types.ObjectId;
}

// A field is a reference when it is an ObjectId with a `ref`,
// or an array whose element type is an ObjectId with a `ref`.
function isRefField(schemaType: any): boolean {
  if (
    isSchemaTypeObjectId(schemaType) &&
    schemaType.options &&
    typeof schemaType.options.ref === "string"
  ) {
    return true;
  }

  if (
    isSchemaTypeArray(schemaType) &&
    isSchemaTypeObjectId(schemaType.caster) &&
    schemaType.caster.options &&
    typeof schemaType.caster.options.ref === "string"
  ) {
    return true;
  }

  return false;
}

Utility functions to get the reference fields and indexed fields from a schema


export function getReferenceFields(schema: Schema): string[] {
  const refFields: string[] = [];
  schema.eachPath((path, type) => {
    if (isRefField(type)) {
      refFields.push(path);
    }
  });
  return refFields;
}


export function getIndexedFields(schema: Schema): string[] {
  const indexes = schema.indexes();
  const indexedFields: string[] = [];

  indexes.forEach((index: any) => {
    const fields = Object.keys(index[0]);
    indexedFields.push(...fields);
  });

  return indexedFields;
}
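To make the flattening step concrete, here is a minimal sketch that runs on plain data instead of a real schema. It assumes the `[fields, options]` pair shape that `schema.indexes()` returns; the `IndexDefinition` type, the `fieldsFromIndexes` name, and the sample index list are hypothetical. Unlike `getIndexedFields`, this variant also removes duplicate field names, which is optional.

```typescript
// Assumed shape: each entry is a [fields, options] pair, as returned by schema.indexes().
type IndexDefinition = [Record<string, unknown>, Record<string, unknown>];

// Collect every field name that appears in any index, without duplicates.
function fieldsFromIndexes(indexes: IndexDefinition[]): string[] {
  const seen = new Set<string>();
  for (const [fields] of indexes) {
    for (const name of Object.keys(fields)) {
      seen.add(name);
    }
  }
  return Array.from(seen);
}

// Hypothetical index list for a user schema.
const sampleIndexes: IndexDefinition[] = [
  [{ email: 1 }, { unique: true }],
  [{ firstName: 1, lastName: 1 }, {}],
  [{ email: 1, tenant: 1 }, {}],
];

console.log(fieldsFromIndexes(sampleIndexes));
// prints [ 'email', 'firstName', 'lastName', 'tenant' ]
```

Note that every field of a compound index ends up in the list on its own, so a lookup built from this list may match on only part of a compound key.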

The Main Function: processReferences

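This function takes the incoming data (e.g., req.body), the Mongoose schema, and a map of models keyed by reference field name, where each entry holds the model and its schema. An optional userId is stored on the result as createdBy. The function recursively processes references, handles nested structures, and uses indexed fields to find existing documents instead of creating duplicates.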


interface ModelSchemaEntry {
  model: Model<any>;
  schema: Schema;
}

export async function processReferences(
  data: any,
  schema: Schema,
  models: { [key: string]: ModelSchemaEntry },
  userId?: string
): Promise<any> {
  const referenceFields = getReferenceFields(schema);

  for (const field of referenceFields) {
    const modelEntry = models[field];
    if (!modelEntry) {
      console.error(`No model found for field: ${field}`);
      continue;
    }
    const { model, schema: fieldSchema } = modelEntry;

    if (Array.isArray(data[field])) {
      const indexedFields = getIndexedFields(fieldSchema);
      for (let i = 0; i < data[field].length; i++) {
        const item = data[field][i];
        if (typeof item === "object" && item !== null && !item._id) {
          await processNestedReferences(item, fieldSchema, models);

          const filter: any = {};
          indexedFields.forEach((indexField) => {
            if (item[indexField] !== undefined) {
              filter[indexField] = item[indexField];
            }
          });

          try {
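            // An empty filter would match an arbitrary document, so only look
            // for an existing one when at least one indexed value was supplied.
            const existingDoc =
              Object.keys(filter).length > 0
                ? await model.findOne(filter)
                : null;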
            if (existingDoc) {
              data[field][i] = existingDoc._id;
            } else {
              const newDoc = await model.create(item);
              data[field][i] = newDoc._id;
            }
          } catch (error) {
            console.error(
              `Error processing item in array for field ${field}:`,
              error
            );
          }
        } else if (typeof item === "string") {
          continue;
        }
      }

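      // Upsert a stub document for every referenced _id, so ids that point at
      // missing documents still resolve. Items that arrived as objects with
      // an _id are reduced to that _id first.
      const bulkOps = data[field].map((item: any) => {
        const id =
          item && typeof item === "object" && item._id ? item._id : item;
        return {
          updateOne: {
            filter: { _id: id },
            update: { $set: { _id: id } },
            upsert: true,
          },
        };
      });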

      try {
        await model.bulkWrite(bulkOps);
      } catch (error) {
        console.error(`Error in bulk operation for field ${field}:`, error);
      }
    } else if (typeof data[field] === "object" && data[field] !== null) {
      await processNestedReferences(data[field], fieldSchema, models, userId);

      const indexedFields = getIndexedFields(fieldSchema);
      const query: any = {};

      indexedFields.forEach((indexField) => {
        if (data[field][indexField] !== undefined) {
          query[indexField] = data[field][indexField];
        }
      });

      try {
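        // An empty query would match an arbitrary document, so only look
        // for an existing one when at least one indexed value was supplied.
        const existingDoc =
          Object.keys(query).length > 0 ? await model.findOne(query) : null;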

        if (existingDoc) {
          data[field] = existingDoc._id;
        } else {
          const doc = await model.create(data[field]);
          data[field] = doc._id;
        }
      } catch (error) {
        console.error(`Error processing document for field ${field}:`, error);
      }
    } else if (typeof data[field] === "string") {
      continue;
    }
  }

  if (userId) {
    data.createdBy = userId;
  }

  return data;
}
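The find-or-create step at the heart of processReferences can be tried out without a database. The sketch below is an illustration under stated assumptions, not the Mongoose API: `InMemoryCollection`, `buildIndexFilter`, and `findOrCreateId` are hypothetical names, and the in-memory class only imitates `findOne` and `create` with exact-match semantics.

```typescript
type Doc = { [key: string]: unknown };

// Build a lookup filter from whichever indexed fields the item actually provides.
function buildIndexFilter(item: Doc, indexedFields: string[]): Doc {
  const filter: Doc = {};
  for (const name of indexedFields) {
    if (item[name] !== undefined) {
      filter[name] = item[name];
    }
  }
  return filter;
}

// Hypothetical in-memory stand-in for a Mongoose model (not a real Mongoose API).
class InMemoryCollection {
  private docs: Doc[] = [];

  findOne(filter: Doc): Doc | null {
    const keys = Object.keys(filter);
    const match = this.docs.find((doc) =>
      keys.every((key) => doc[key] === filter[key])
    );
    return match ?? null;
  }

  create(item: Doc): Doc {
    const doc: Doc = { ...item, _id: `id-${this.docs.length + 1}` };
    this.docs.push(doc);
    return doc;
  }
}

// Return the _id of a matching document, creating one when nothing matches.
function findOrCreateId(
  collection: InMemoryCollection,
  item: Doc,
  indexedFields: string[]
): unknown {
  const filter = buildIndexFilter(item, indexedFields);
  // Guard: an empty filter matches any document, so skip the lookup in that case.
  const existing =
    Object.keys(filter).length > 0 ? collection.findOne(filter) : null;
  return (existing ?? collection.create(item))._id;
}

const tags = new InMemoryCollection();
const first = findOrCreateId(tags, { label: "node" }, ["label"]);
const second = findOrCreateId(tags, { label: "node", color: "green" }, ["label"]);
const third = findOrCreateId(tags, { color: "red" }, ["label"]);
console.log(first, second, third); // prints: id-1 id-1 id-2
```

The third call shows why the empty-filter guard matters: without it, an item that supplies no indexed field would be silently mapped to whichever document the lookup returns first.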

Recursively Processing Nested References

The processNestedReferences function handles references nested inside other referenced documents. It applies the same find-or-create logic as processReferences at every level, without the bulk upsert and the createdBy stamp.


async function processNestedReferences(
  data: any,
  schema: Schema,
  models: { [key: string]: ModelSchemaEntry },
  userId?: string
) {
  const referenceFields = getReferenceFields(schema);

  for (const field of referenceFields) {
    const modelEntry = models[field];
    if (!modelEntry) {
      console.error(`No model found for field: ${field}`);
      continue;
    }

    const { model, schema: fieldSchema } = modelEntry;

    if (Array.isArray(data[field])) {
      const indexedFields = getIndexedFields(fieldSchema);

      for (let i = 0; i < data[field].length; i++) {
        const item = data[field][i];
        if (typeof item === "object" && item !== null && !item._id) {
          await processNestedReferences(item, fieldSchema, models);

          const filter: any = {};
          indexedFields.forEach((indexField) => {
            if (item[indexField] !== undefined) {
              filter[indexField] = item[indexField];
            }
          });

          try {
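            // An empty filter would match an arbitrary document, so only look
            // for an existing one when at least one indexed value was supplied.
            const existingDoc =
              Object.keys(filter).length > 0
                ? await model.findOne(filter)
                : null;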
            if (existingDoc) {
              data[field][i] = existingDoc._id;
            } else {
              const newDoc = await model.create(item);
              data[field][i] = newDoc._id;
            }
          } catch (error) {
            console.error(
              `Error processing item in array for field ${field}:`,
              error
            );
          }
        } else if (typeof item === "string") {
          continue;
        }
      }
    } else if (typeof data[field] === "object" && data[field] !== null) {
      await processNestedReferences(data[field], fieldSchema, models);

      const indexedFields = getIndexedFields(fieldSchema);
      const query: any = {};

      indexedFields.forEach((indexField) => {
        if (data[field][indexField] !== undefined) {
          query[indexField] = data[field][indexField];
        }
      });

      try {
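        // An empty query would match an arbitrary document, so only look
        // for an existing one when at least one indexed value was supplied.
        const existingDoc =
          Object.keys(query).length > 0 ? await model.findOne(query) : null;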

        if (existingDoc) {
          data[field] = existingDoc._id;
        } else {
          const doc = await model.create(data[field]);
          data[field] = doc._id;
        }
      } catch (error) {
        console.error(`Error processing document for field ${field}:`, error);
      }
    } else if (typeof data[field] === "string") {
      continue;
    }
  }
}
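The depth-first order is the important part of the recursion: children are resolved to ids before their parent is looked up or created. Here is a simplified, synchronous sketch of that idea on plain objects. Everything in it is a stand-in: `defs` plays the role of the schemas, `store` replaces the database, and the entity names and sample payload are made up for illustration.

```typescript
type Doc = { [key: string]: unknown };

// Hypothetical stand-in for a schema: reference fields and identifying fields.
interface EntityDef {
  refs: { [field: string]: string }; // field name -> referenced entity
  keys: string[]; // fields used to detect an existing document
}

const defs: { [entity: string]: EntityDef } = {
  post: { refs: { author: "user", tags: "tag" }, keys: ["slug"] },
  user: { refs: { company: "company" }, keys: ["email"] },
  company: { refs: {}, keys: ["name"] },
  tag: { refs: {}, keys: ["label"] },
};

// In-memory replacement for the database collections.
const store: { [entity: string]: Doc[] } = {};

function isPlainObject(value: unknown): value is Doc {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Reuse a stored document when its identifying fields match, else create one.
function findOrCreate(entity: string, doc: Doc): string {
  const rows = store[entity] ?? (store[entity] = []);
  const keys = defs[entity].keys.filter((key) => doc[key] !== undefined);
  const hit =
    keys.length > 0
      ? rows.find((row) => keys.every((key) => row[key] === doc[key]))
      : undefined;
  if (hit) {
    return hit._id as string;
  }
  const _id = `${entity}-${rows.length + 1}`;
  rows.push({ ...doc, _id });
  return _id;
}

// Resolve children first, then replace each nested object with its id.
function resolveRefs(entity: string, data: Doc): Doc {
  const refs = defs[entity].refs;
  for (const field of Object.keys(refs)) {
    const target = refs[field];
    const value = data[field];
    const toId = (item: unknown): unknown =>
      isPlainObject(item) && !item._id
        ? findOrCreate(target, resolveRefs(target, item))
        : item;
    if (Array.isArray(value)) {
      data[field] = value.map(toId);
    } else if (value !== undefined) {
      data[field] = toId(value);
    }
  }
  return data;
}

const post = resolveRefs("post", {
  slug: "hello-world",
  author: { email: "ada@example.com", company: { name: "Acme" } },
  tags: [{ label: "node" }, { label: "node" }, "tag-existing"],
});
console.log(JSON.stringify(post));
// prints {"slug":"hello-world","author":"user-1","tags":["tag-1","tag-1","tag-existing"]}
```

The two identical tag objects collapse to the same id, and the string entry passes through untouched, which mirrors how the real functions skip values that are already ids.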

How It All Comes Together

With this utility, complex data structures take little extra code to manage. All we need to do is pass req.body, the schema, and the models map into processReferences. The function takes care of:

  1. Handling ObjectId references (including nested references).
  2. Managing arrays of objects with references.
  3. Respecting indexed fields to avoid duplicate entries.
  4. Finding an existing document by its indexed fields and creating one only when none matches.
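One caveat worth checking in your own schemas: for fields declared inside a plain nested object, `schema.eachPath` reports dotted paths such as `meta.author`, while the functions in this post read `data[field]` directly. The helpers below are a hypothetical sketch of how such a path could be read and written on a plain payload; `getByPath`, `setByPath`, and the sample body are not part of the original utility.

```typescript
type Doc = { [key: string]: unknown };

function isRecord(value: unknown): value is Doc {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Read the value at a dotted path such as "meta.author"; undefined if any step is missing.
function getByPath(data: Doc, path: string): unknown {
  let current: unknown = data;
  for (const key of path.split(".")) {
    if (!isRecord(current)) {
      return undefined;
    }
    current = current[key];
  }
  return current;
}

// Write a value at a dotted path, but only when the parent objects already exist.
function setByPath(data: Doc, path: string, value: unknown): boolean {
  const keys = path.split(".");
  const last = keys.pop() as string;
  const parent = keys.length > 0 ? getByPath(data, keys.join(".")) : data;
  if (!isRecord(parent)) {
    return false;
  }
  parent[last] = value;
  return true;
}

const body: Doc = {
  title: "Hello",
  meta: { author: { email: "ada@example.com" } },
};

console.log(body["meta.author"]); // prints undefined
console.log(getByPath(body, "meta.author")); // prints { email: 'ada@example.com' }
setByPath(body, "meta.author", "user-1");
console.log(JSON.stringify(body)); // prints {"title":"Hello","meta":{"author":"user-1"}}
```

Swapping `data[field]` for helpers like these would be one way to cover references that live inside nested objects; whether that is needed depends on how your schemas are laid out.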

Conclusion

This solution has been a game-changer for me in handling complex Mongoose schemas. It abstracts away the repetitive tasks of checking for existing documents, handling nested structures, and ensuring data integrity. If you work with Mongoose frequently, I highly recommend implementing something similar in your codebase!

P.S. The version of this function I posted earlier could only handle objects and ObjectIds.

Feel free to comment below if you have any questions or improvements.

Happy coding! ✌️
