Over 590 million resumes leaked through open databases at Chinese companies


The leaks came from exposed ElasticSearch and MongoDB databases. In the first three months of the year, Chinese companies leaked 590 million resumes, we have learned from several security researchers.

Most of the resume leaks were caused by misconfigured MongoDB and ElasticSearch servers that were left exposed online without a password, or that ended up online because of unintended firewall errors. In recent months, and particularly in the past few weeks, we have received various tips about exposed servers that, on examination, belonged to HR-focused Chinese companies.

From small companies that exposed a handful of CVs to professional executive headhunters, everyone has, in one form or another, lost information about their customers. Sanyam Jain, a security researcher and a member of the GDI Foundation, brought most of these leaks to our attention.

In the last month alone, Jain found and reported seven such leaks, and only four of them were secured before this article's publication. His first discovery, on 10 March, was an ElasticSearch server containing the resumes of 33 million Chinese users.

Four days after Jain notified China’s National Computer Emergency Response Team (CNCERT), the database was secured. His second finding, on 13 March, was an ElasticSearch server holding 84.8 million CVs, which had also been spotted a few days earlier. With the help of CNCERT, this server was taken down as well. Jain’s third discovery, on 15 March, was another ElasticSearch instance holding 93 million resumes.

Jain said that this database was taken offline unexpectedly and that he had received no response from CNCERT. The fourth server, another ElasticSearch instance, held resumes from a Chinese firm and contained only nine million CVs. The fifth server was Jain’s biggest finding, an ElasticSearch cluster holding 129 million resumes. At the time of writing, this database remains online because Jain could not identify its owner.

Jain’s last two discoveries were also his smallest. The sixth was a server holding 180,000 resumes, and the seventh stored only 17,000. Jain discovered this last one just hours before this article's publication. Jain was not the only researcher to stumble upon such databases, however.

The most interesting of all the databases that leaked the resumes of Chinese users was the one that security researcher Devin Stokes shared with us two weeks ago. It was an ElasticSearch server that contained 19 million resumes of Chinese people, all in management positions. The database belonged to a company operating on the Chinese market. The researcher did not name the company for this article.

In addition to resumes, this server contained full user profiles, including current positions, recent discussions between recruiters and managers, training sessions, and more. The leaky server also held a list of firms that had signed up for the company's headhunting services and had hired managers through it. A cursory look at this list revealed both foreign companies, such as Kraft Heinz and StonCor, and many local Chinese companies, such as China Aviation Power Control and Wuxi AMT Technology.

Fortunately, this database was secured faster than most, two days after Stokes emailed CNCERT. Apart from Jain and Stokes, Bob Diachenko of Security Discovery is another well-known data breach hunter who has stumbled upon such databases.

Diachenko found a similarly exposed server containing the CVs of 20.5 million Chinese users yesterday, and the researcher is currently working to identify the company that leaked the data and notify it. But let us also not forget Diachenko’s other finding, a MongoDB database discovered in January that exposed the resumes of more than 202 million Chinese people.

In total, that makes 590,497,000 resumes leaked over the past three months by Chinese companies, a worrying sign that Chinese HR companies do not take the security of their servers seriously. You may think that exposing the data in a resume is not very important, since resumes are inherently public documents, but that is not the case. People share resumes with specific parties on the understanding that the resume will be used only to assess them for a particular position.
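
The headline total can be reproduced from the per-server figures reported in this article. The short Python sketch below simply adds them up; the dictionary name and labels are illustrative choices of ours, and the January MongoDB database is entered as exactly 202 million even though it was reported as "more than" that, so the real total is slightly higher.

```python
# Resume counts as reported in this article, one entry per exposed server.
# Labels are illustrative; "more than 202 million" is entered as 202,000,000.
reported_leaks = {
    "Jain, 10 March (ElasticSearch)": 33_000_000,
    "Jain, 13 March (ElasticSearch)": 84_800_000,
    "Jain, 15 March (ElasticSearch)": 93_000_000,
    "Jain, fourth server (ElasticSearch)": 9_000_000,
    "Jain, fifth server (ElasticSearch)": 129_000_000,
    "Jain, sixth server": 180_000,
    "Jain, seventh server": 17_000,
    "Stokes (ElasticSearch)": 19_000_000,
    "Diachenko, found yesterday": 20_500_000,
    "Diachenko, January (MongoDB)": 202_000_000,
}

# Integer arithmetic avoids any floating-point rounding in the sum.
total = sum(reported_leaks.values())
print(f"{total:,} resumes across {len(reported_leaks)} servers")
```

The ten servers add up to 590,497,000 resumes, which is the "over 590 million" figure in the headline.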

When users share resumes online on their own sites, they regularly edit out the personally identifiable information that appears in the full version of a resume, such as telephone numbers, home addresses, family and marital status, and, in some cases, ID numbers, depending on the requirements of certain HR companies.

Similarly, when they fill out personal information on job portals, they trust that certain data will be available only to employers and not to the entire internet. The rate at which Chinese HR companies and job portals are leaking CVs is worrying, not only for user privacy but also for the firms themselves.

Mark Funk
Mark Funk is an experienced information security specialist who works with enterprises to mature and improve their enterprise security programs. Previously, he worked as a security news reporter.