
How to load a huge amount of data from CSV into PostgreSQL

Recently some developers challenged me, claiming that the PostgreSQL server was slow. In their opinion it was not able to load 5 lakh (500,000) records from CSV files. They showed me their application, which used Node.js with the Sequelize ORM. When executed, it really did get stuck partway through.

5 lakh records is a large amount, but PostgreSQL is built for high-performance applications, and there is no reason to say it cannot load this much data. So I suggested that one of them start by importing a small number of records: the top 6, then 600, then 6,000, then 60,000. He did, and the import really did stop after 60,000 records. It failed from the application: the program just kept running and running.

So the ball was in my court. I started digging into the database server configuration parameters such as effective_cache_size, work_mem, shared_buffers, max_connections, and wal_buffers, set appropriate values according to best practices and the available system resources, and then restarted the server to try again.

We repeated the previous steps, inserting small amounts first: the top 6, then 600, then 6,000, then 60,000. Once again it stopped after 60,000 records, and once again the failure came from the application.

Next I tried the insert from the database console. I executed the statements below and, believe it or not, they finished in less than 2 seconds with all the data inserted into the table.

-- Staging table: every column is text so the raw file loads without type errors.
CREATE TABLE fare
(
CompanyCode character varying(255),
LineNumber character varying(255),
CardTypeCode character varying(255),
FirstLocationCode character varying(255),
SecondLocationCode character varying(255),
FareAmount character varying(255)
);

-- Bulk-load the tab-separated file (the path is read on the database server).
copy fare(CompanyCode,LineNumber,CardTypeCode,FirstLocationCode,SecondLocationCode,FareAmount)
From E'C:\\master_part\\cmn0006_0103.tsv' with (format csv, delimiter E'\t');

-- Remove the header row that was loaded as data.
delete from fare where CompanyCode='Company Code';

-- Copy the staged rows into the application's table.
insert into "fareMaster"("companyCode","lineNumber","cardTypeCode","firstLocationCode","secondLocationCode","fareAmount")
select companycode,linenumber,cardtypecode,firstlocationcode,secondlocationcode,fareamount from fare ;




So the ball was out of my court. When he started digging into the application, he made some minor changes on the ORM side, and the application started loading lakhs of records into the database. Now he believes that PostgreSQL can handle huge data inserts in one go.
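
The post does not say what was changed on the ORM side. One common cause of this kind of stall is sending one insert per row, so the sketch below shows the usual remedy of sending rows in batches. It is only an illustration under that assumption: `chunkRows`, `importInBatches`, and the `insertBatch` callback are hypothetical names, and `insertBatch` stands in for whatever performs one multi-row insert (with Sequelize that could be a call to `Model.bulkCreate`).

```javascript
// Hypothetical helper: split an array of parsed CSV rows into fixed-size batches.
function chunkRows(rows, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < rows.length; i += batchSize) {
    batches.push(rows.slice(i, i + batchSize));
  }
  return batches;
}

// Hypothetical driver: insertBatch is supplied by the caller and is assumed
// to perform one multi-row insert and return a promise.
async function importInBatches(rows, batchSize, insertBatch) {
  let inserted = 0;
  for (const batch of chunkRows(rows, batchSize)) {
    await insertBatch(batch); // one round trip per batch instead of per row
    inserted += batch.length;
  }
  return inserted;
}
```

With a batch size of 1,000, for example, 500,000 rows become 500 insert calls instead of 500,000. Awaiting each batch before sending the next also keeps the number of pending queries bounded.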
