How can we use Data Science to fight corruption?

Data mining is an analytic process designed to explore data (usually large amounts of data, typically business or market related, also known as “big data”) in search of consistent patterns and/or systematic relationships between variables, and then to validate the findings by applying the detected patterns…

Two centuries ago, coal mining spurred the European continent’s industrial revolution. Today, data mining is refueling the information revolution caused by exploding streams of data. Using data mining techniques to profile customer preferences and predict purchasing patterns has become common practice in the private sector. But can data mining also be used to fight corruption? And if so, how?


Last year, Transparency International Georgia launched an open source procurement monitoring and analytics portal, which extracts data from the government’s central e-procurement website and repackages it into user-friendly formats. Users can now generate profiles of procurement transactions made by government agencies, profiles of companies bidding for public contracts, and search aggregate statistical data on government spending. If citizens suspect law violations in electronic tender processes, they can submit an online report that a Dispute Resolution Board reviews within ten working days.

Data mining’s potential to identify inadequacies in processes involving electoral authorities and public money can be taken even further.


The European Commission, in cooperation with Transparency International, developed the ARACHNE data analytics software, which cross-checks data from various public and private institutions and helps to identify projects susceptible to risks of fraud, conflicts of interest or irregularities.


Researchers from the Corruption Research Center Budapest have examined huge data sets of public procurement procedures from EU countries, searching for abnormal patterns such as exceptionally short bidding periods or unusual outcomes (e.g. no competition for the winning bid, or bids repeatedly won by the same company). Using inferential statistics (analysis that draws conclusions beyond what the data directly captures), they identified corrupt behavior based on deviations from ordinary patterns.
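
To make the idea of "abnormal patterns" concrete, here is a minimal Python sketch of a rule-based red-flag check. It is not the researchers' actual method: the `Tender` fields, the sample records and both thresholds are assumptions made up for illustration, and a real analysis would calibrate such thresholds statistically against ordinary procurement behavior.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Tender:
    # Hypothetical record layout; real procurement data sets differ.
    tender_id: str
    buyer: str
    winner: str
    bidding_days: int
    num_bidders: int


MIN_BIDDING_DAYS = 10      # assumed cut-off for a "short" bidding period
REPEAT_WIN_THRESHOLD = 3   # assumed number of wins from one buyer that looks unusual


def red_flags(tenders):
    """Return {tender_id: [reasons]} for tenders matching any red-flag rule."""
    # Count how often each company wins with the same buyer.
    wins = Counter((t.buyer, t.winner) for t in tenders)
    flags = {}
    for t in tenders:
        reasons = []
        if t.bidding_days < MIN_BIDDING_DAYS:
            reasons.append("short bidding period")
        if t.num_bidders <= 1:
            reasons.append("single bidder")
        if wins[(t.buyer, t.winner)] >= REPEAT_WIN_THRESHOLD:
            reasons.append("repeat winner")
        if reasons:
            flags[t.tender_id] = reasons
    return flags


# Invented sample data, for illustration only.
SAMPLE_TENDERS = [
    Tender("T1", "Agency A", "Company X", 30, 4),
    Tender("T2", "Agency A", "Company X", 5, 1),
    Tender("T3", "Agency A", "Company X", 28, 3),
    Tender("T4", "Agency B", "Company Y", 35, 5),
    Tender("T5", "Agency B", "Company Z", 7, 2),
]

for tender_id, reasons in red_flags(SAMPLE_TENDERS).items():
    print(tender_id, "->", ", ".join(reasons))
```

A flag here is only a prompt for closer review, not evidence of wrongdoing; a short bidding period or a single bidder can have legitimate explanations.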


Data mining can also be used to detect tax fraud and improve taxpayers' compliance. In the aftermath of Luxleaks, when a whistleblower released reams of data about tax avoidance schemes in Luxembourg, the data mining techniques used by New York City’s former finance commissioner David Frankel could provide some inspiration: by “identifying people who had businesses similar to others but who stood out as outliers on taxes paid”, the auditing team improved the efficiency of its investigations into companies suspected of underpaying taxes.
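
The "outliers among similar businesses" idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the auditing team's method: the peer groups, the figures and the cut-off of 3.5 are invented, and the score used is a standard robust z-score based on the median and the median absolute deviation (MAD).

```python
from collections import defaultdict
from statistics import median


def low_tax_outliers(businesses, threshold=3.5):
    """Flag businesses whose effective tax rate is far below their peers'.

    businesses: iterable of (name, peer_group, revenue, tax_paid) tuples.
    threshold: assumed cut-off on the robust z-score.
    """
    groups = defaultdict(list)
    for name, group, revenue, tax_paid in businesses:
        if revenue > 0:  # skip records where a rate cannot be computed
            groups[group].append((name, tax_paid / revenue))

    flagged = []
    for group, rows in groups.items():
        rates = [rate for _, rate in rows]
        med = median(rates)
        mad = median([abs(rate - med) for rate in rates])
        if mad == 0:
            continue  # all peers look identical; nothing to compare against
        for name, rate in rows:
            score = 0.6745 * (rate - med) / mad
            if score < -threshold:  # only unusually LOW tax rates are of interest
                flagged.append((name, group, round(rate, 3)))
    return flagged


# Invented sample data, for illustration only.
SAMPLE_BUSINESSES = [
    ("Cafe A", "restaurant", 1000, 100),
    ("Cafe B", "restaurant", 1000, 110),
    ("Cafe C", "restaurant", 1000, 90),
    ("Cafe D", "restaurant", 1000, 100),
    ("Cafe E", "restaurant", 1000, 20),
    ("Shop A", "retail", 1000, 200),
    ("Shop B", "retail", 1000, 210),
    ("Shop C", "retail", 1000, 190),
    ("Shop D", "retail", 1000, 220),
]

print(low_tax_outliers(SAMPLE_BUSINESSES))
```

Comparing each business only with its own peer group matters: a tax rate that is normal for a retailer could be unusual for a restaurant, and the median-based score keeps a single extreme value from distorting the baseline.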


Similarly, data mining could be used to fight money laundering: an algorithm reviewing banking data and comparing it with information about financial crime data points could, for example, help reveal illicit financial flows, an issue that ranks high on Transparency International’s agenda.


The wealth of data that can today be gathered through remote sensing, crowd-sourced citizen reports, news media, census data, mobile phone activity, social networking sites and so on, combined with traditional indicators, makes for seemingly endless opportunities. Do you want to identify issues of conflict of interest and/or revolving doors? Do you want to know what people are thinking about corruption in a specific country context? Text mining techniques that analyze social media noise during a given period of time could give you an answer.
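
As a very rough illustration of text mining over a time window, the Python sketch below counts corruption-related keywords in posts published between two dates. The keyword list and the sample posts are invented for this example, and real text mining would go much further (language handling, deduplication, sentiment or topic modelling).

```python
import re
from collections import Counter
from datetime import date

# Assumed keyword list; a real study would build this per language and country.
KEYWORDS = {"bribe", "bribery", "corruption", "kickback", "nepotism"}


def keyword_mentions(posts, start, end):
    """Count keyword occurrences in posts dated between start and end (inclusive).

    posts: iterable of (posted_on, text) tuples, where posted_on is a date.
    """
    counts = Counter()
    for posted_on, text in posts:
        if start <= posted_on <= end:
            tokens = re.findall(r"[a-z]+", text.lower())
            counts.update(token for token in tokens if token in KEYWORDS)
    return counts


# Invented sample posts, for illustration only.
SAMPLE_POSTS = [
    (date(2015, 1, 5), "Another bribe demanded at the permit office. Corruption everywhere!"),
    (date(2015, 1, 6), "Great football match last night."),
    (date(2015, 1, 7), "The minister's nephew got the contract. Nepotism and corruption again."),
    (date(2015, 3, 1), "Bribery trial starts today."),
]

january = keyword_mentions(SAMPLE_POSTS, date(2015, 1, 1), date(2015, 1, 31))
print(january.most_common())
```

Running the same count over successive periods gives a crude picture of how much public conversation touches on corruption, and when it spikes.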


There are many ways non-profits and civil society organizations can benefit from data mining on a non-profit basis. These include organizing events and advocating for the replication of tools and platforms that not only make data public, but also make it relatively easy to organize and process.


The European Citadel on the Move website, for example, allows users to upload data sets and to create customized applications, even if they have little experience in data management. One of the most transparent and user-friendly initiatives at the local level is the website Checkbook NYC 2.0, which provides access to the New York City government’s US$70 billion annual budget. It details how money is spent, including specific information on contracts, payments, revenues, budget reports, and audits. It features an application programming interface that lets third parties select the data they want and then use it for their own purposes.

Data mining’s nimble and purpose-oriented character can do a lot to dispel the fog in which the public sector operates. But more effort is needed to use its potential to the full and make it available to the widest audience.

Patriotism is not only about rallies in the streets; you can also fight such ills through your own groundwork. So why wait? Enroll in the DataTrained Full Stack Data Science Program and help our nation fight back.
