Removing duplicate objects from an Array


Removing duplicate objects from an Array (is hard)

Let’s say we have an array of objects such as the following:

const books = [
  {
    name: '西游记',
    author: '吴承恩',
  },
  {
    name: '红楼梦',
    author: '曹雪芹',
  },
  {
    name: '西游记',
    author: '吴承恩',
  },
];

The first and the last objects in the array are identical. So what if we want to remove such duplicate objects from the array? Surprisingly, this is quite a difficult problem to solve. To understand why, let’s look at how we could remove duplicates from an Array of flat items, such as strings.

Removing duplicate flat items from an Array (is easy)

Let’s say we had an Array of strings such as the following:

const strings = ['西游记', '红楼梦', '西游记'];

If we wanted to remove any duplicates from this array, we could use the filter() method along with the indexOf() method to check if any given item is a duplicate.

const filteredStrings = strings.filter((item, index) => {
  // Return to new array if the index of the current item is the same
  // as the first occurence of the item
  return strings.indexOf(item) === index;
});

Since strings.indexOf(item) will always return the index of the first occurrence of the item, we can tell if the current item within the filter loop is a duplicate. If it is, we don’t return it to the new array created by the filter() method

const uniq = (array) => [...new Set(array)];
console.log(uniq(strings)); // [ '西游记', '红楼梦' ]
console.log(uniq(books)); //无法对对象去重

Objects don’t work the same way

The reason this same method doesn’t work with objects is because any 2 objects that have identical properties and values are not actually considered identical.

const a = {
  name: '西游记',
  author: '吴承恩',
};
const b = {
  name: '西游记',
  author: '吴承恩',
};

a === b; // false

This is because objects are compared based on reference rather than structure. The fact that the two objects have the same orperties and values is not considered when comparing the two. Therefore, the indexOf(object) within an array of objects will always return the index of the exact object passed, even if there exists another object with the exact same properties and values.

My solution

Given this information, the only way to check if two objects have the same properties and values is to actually check the properties and values of each object. The solution I came up with involved doing this manual check, but with some enhacements to improve performance and reduce unnecessary nested loops.

In particular, I did 3 things -

  1. Only check each item in the array against every other item that comes after it in order to avoid comparing the same objects more than once
  2. Only check items that have not been found to be a duplicate of any other item
  3. Before checking if the values of each property are the same, check that both objects have the same keys
  1. 为了避免多次比较同一对象,请仅将数组中的每个项目与其后的其他所有项目进行检查
  2. 仅检查未发现与其他项目重复的项目
  3. 在检查每个属性的值是否相同之前,请检查两个对象具有相同的键

Here’s the final function:

function removeDuplicates(arr) {
  const result = [];
  const duplicatesIndices = [];

  // Loop through each item in the original array
  arr.forEach((current, index) => {
    if (duplicatesIndices.includes(index)) return;

    result.push(current);

    // Loop through each other item on array after the current one
    for (let comparisonIndex = index + 1; comparisonIndex < arr.length; comparisonIndex++) {
      const comparison = arr[comparisonIndex];
      const currentKeys = Object.keys(current);
      const comparisonKeys = Object.keys(comparison);

      // Check number of keys in objects
      if (currentKeys.length !== comparisonKeys.length) continue;

      // Check key names
      const currentKeysString = currentKeys.sort().join('').toLowerCase();
      const comparisonKeysString = comparisonKeys.sort().join('').toLowerCase();
      if (currentKeysString !== comparisonKeysString) continue;

      // Check values
      let valuesEqual = true;
      for (let i = 0; i < currentKeys.length; i++) {
        const key = currentKeys[i];
        if (current[key] !== comparison[key]) {
          valuesEqual = false;
          break;
        }
      }
      if (valuesEqual) duplicatesIndices.push(comparisonIndex);
    } // end for loop
  }); // end arr.forEach()

  return result;
}
console.log(removeDuplicates(books)); //可以对对象去重
// [ { name: '西游记', author: '吴承恩' }, { name: '红楼梦', author: '曹雪芹' } ]

https://stackblitz.com/edit/remove-duplicates-arrayofobjects


文章作者: Ire Aderinokun
版权声明: 本博客所有文章除特別声明外,均采用 CC BY 4.0 许可协议。转载请注明来源 Ire Aderinokun !
  目录