Discussion: pg_Restore
bhanu udaya
2013-01-21 04:10:08 UTC
Hello,

Greetings !
I am new to Postgres and recently migrated from Oracle. I am using PostgreSQL version 9.2.
I am trying to restore a 9.5 GB database (1 GB dump file) which has 500 schemas with 1 lakh (100,000) rows in each schema. I took the data dump using pg_dump, which takes around 40 minutes. I tried to use pg_restore to restore this dump, but it takes hours. I have used the configuration parameters below:

shared_buffers = 1024MB
work_mem = 512MB
maintenance_work_mem = 512MB
wal_buffers = 100MB

I have used command to restore as below:
pg_restore -d newdb -j2 e:\file.dmp

My machine configurations are as below:
Windows, Core i5 with 4GB RAM.

Thanks for your help.

Thanks and Regards
Radha Krishna
François Beausoleil
2013-01-21 04:19:44 UTC
Post by bhanu udaya
I am new to Postgres and recently migrated from Oracle. I am using PostgreSQL version 9.2.
Welcome, and good choice for the version.
Post by bhanu udaya
shared_buffers = 1024MB
work_mem = 512MB
maintenance_work_mem = 512MB
wal_buffers = 100MB
pg_restore -d newdb -j2 e:\file.dmp
Windows, Core i5 with 4GB RAM.
Other settings you can change during the initial restore / load phase:

fsync = off # ONLY DURING INITIAL DATA LOAD!

checkpoint_segments = 128 # large value, such as 128 or 256 (16MB per file, check disk space)
checkpoint_timeout = 30min
checkpoint_completion_target = 0.9

wal_level = minimal # You'll need to do a full base backup if you use this

Read this section of the manual: http://www.postgresql.org/docs/current/static/runtime-config-wal.html
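Note that fsync and wal_level only take effect after a server restart. On Windows something like this should do it (the service name is a guess based on the standard installer, check yours):

net stop postgresql-x64-9.2
net start postgresql-x64-9.2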

Have a great day!
François Beausoleil
bhanu udaya
2013-01-21 06:17:35 UTC
Hello,
Greetings!
Thank you for the prompt reply. I have changed the settings as listed below:
Post by François Beausoleil
Post by bhanu udaya
shared_buffers = 1024MB
work_mem = 512MB
maintenance_work_mem = 512MB
wal_buffers = 100MB
fsync = off # ONLY DURING INITIAL DATA LOAD!
checkpoint_segments = 128 # large value, such as 128 or 256 (16MB per file, check disk space)
checkpoint_timeout = 30min
checkpoint_completion_target = 0.9
wal_level = minimal # You'll need to do a full base backup if you use this
But I still have the same problem. It is almost 1 hour now and the restoration is still going on. After every test case execution we would like to refresh the database, and the expected refresh should complete in less than 10 minutes. Is this achievable with the kind of configuration I have listed in my earlier email? Kindly help with how to speed up this restoration process.

Thanks and Regards
Radha Krishna
Magnus Hagander
2013-01-21 07:15:47 UTC
Post by bhanu udaya
But I still have the same problem. It is almost 1 hour now and the restoration is
still going on. After every test case execution we would like to refresh the
database, and the expected refresh should complete in less than 10 minutes.
Is this achievable with the kind of configuration I have listed in my
earlier email?
Kindly help with how to speed up this restoration process.
Try running pg_restore with the -1 option. If that doesn't help, try -j4 or
something like that (you'll have to remove the first option then; you can't use
both at once).
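For example, assuming the dump is a custom-format archive (use only one of the two options at a time):

pg_restore -1 -d newdb e:\file.dmp
pg_restore -j4 -d newdb e:\file.dmp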

But it's going to be pushing it anyway. Your scenario is going to create
thousands of files (assuming you have multiple tables in each of your
schemas, as is normal), and that's just not something NTFS does very fast.
Once the files are there, I bet loading the data is reasonably fast since
it can't be all that big....

/Magnus
bhanu udaya
2013-01-21 09:31:04 UTC
Hello,
Greetings!
I tried all the options below. It takes approximately 1 hour 30 minutes to restore a 9GB database. This much time cannot be afforded, as the execution of the test cases takes only 10% of this whole time, and waiting 1 hour 30 minutes after every test case execution is a lot for the team. Kindly let me know if we can reduce the database restoration time.

Thanks and Regards
Radha Krishna
Raghavendra
2013-01-21 10:11:18 UTC
Post by bhanu udaya
Hello,
Greetings !
I tried all the options below. It takes approximately 1 hour 30
minutes to restore a 9GB database. This much time cannot be afforded,
as the execution of the test cases takes only 10% of this whole time, and
waiting 1 hour 30 minutes after every test case execution is a lot for the
team. Kindly let me know if we can reduce the database restoration time.
On Linux, the settings below work well together with the -j option of
pg_restore. Since it's Windows, give another try with the options below
(as already suggested in this email).

shared_buffers = 1024MB
work_mem = 512MB
maintenance_work_mem = 1GB
checkpoint_segments = 128 # anywhere between 128 and 256
checkpoint_timeout = 1h # default is 15min
autovacuum = off
track_counts = off
fsync = off
full_page_writes = off
synchronous_commit = off
bgwriter_delay = 50ms # default is 200ms

---
Regards,
Raghavendra
EnterpriseDB Corporation
Blog: http://raghavt.blogspot.com/
Albe Laurenz
2013-01-21 11:39:00 UTC
Post by bhanu udaya
I tried all the options below. It takes approximately 1 hour 30 minutes to restore a 9GB
database. This much time cannot be afforded, as the execution of the test cases takes only 10% of this
whole time, and waiting 1 hour 30 minutes after every test case execution is a lot for the team. Kindly
let me know if we can reduce the database restoration time.
I don't know if that helps, but have you tried creating a template database
and doing DROP DATABASE xxx; CREATE DATABASE xxx TEMPLATE mytemplate;
instead of restoring a dump every time?

Maybe that is faster.
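A minimal sketch of that cycle, assuming the dump has already been restored once into newdb (the names are placeholders):

-- one-time: keep a pristine copy as a template
CREATE DATABASE mytemplate TEMPLATE newdb;
-- per refresh: drop the working copy and clone it again
DROP DATABASE newdb;
CREATE DATABASE newdb TEMPLATE mytemplate;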

Yours,
Laurenz Albe
dinesh kumar
2013-01-21 12:15:35 UTC
Hi Bhanu,

Yes, the approach below is the faster one to follow.

Post by Albe Laurenz
I don't know if that helps, but have you tried creating a template database
and doing DROP DATABASE xxx; CREATE DATABASE xxx TEMPLATE mytemplate;
instead of restoring a dump every time?
Maybe that is faster.
If you are trying to take the dump from one cluster and restore it in
another cluster, then make sure your pg_restore uses the parallel option "-j",
follow the parameters Raghav suggested, and tune wal_buffers to
some value between 32MB and 64MB. Also, if possible, keep your dump file on
a different partition than PGDATA, which can improve the I/O balance.
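For example (values and paths are only illustrative):

# in postgresql.conf
wal_buffers = 64MB

# dump file on a different drive than PGDATA
pg_restore -j4 -d newdb f:\dumps\file.dmp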

Thanks.

Best Regards,
Dinesh
manojadinesh.blogspot.com
Chris Travers
2013-01-21 12:16:19 UTC
Post by Albe Laurenz
Post by bhanu udaya
I tried all the options below. It takes approximately 1 hour 30
minutes to restore a 9GB database. This much time cannot be afforded,
as the execution of the test cases takes only 10% of this whole time,
and waiting 1 hour 30 minutes after every test case execution is a lot
for the team. Kindly let me know if we can reduce the database
restoration time.
I don't know if that helps, but have you tried creating a template database
and doing DROP DATABASE xxx; CREATE DATABASE xxx TEMPLATE mytemplate;
instead of restoring a dump every time?
Also for test cases, my preferred way is to put every test case in a
transaction that cannot commit. This way the tests are safe to run on
production environments. See pgTAP for one possibility here if you are
testing stored procedures. (Application code we run through wrappers that
filter out commits.)
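A minimal sketch of that pattern (the table and statements are hypothetical):

BEGIN;
-- exercise the code under test
INSERT INTO accounts (name) VALUES ('test user');
SELECT count(*) FROM accounts WHERE name = 'test user';
ROLLBACK; -- nothing the test did survives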

But the template approach is also a good one if you cannot guarantee that
the tests always roll back.

Best Wishes,
Chris Travers
bhanu udaya
2013-01-21 15:10:05 UTC
Hello,
Thanks a lot for all your replies. I tried all the settings suggested; it did not work. pg_restore is very slow and does not finish in less than 1 1/2 hours. Can you please let me know the procedure for the template approach? Will it restore the data also? Please update. I need the database (metadata + data) to be restored back after certain Java executions. Example: I have a 9.5 GB database (with 500 schemas + data). This is treated as the base database and it needs to be restored every time after certain transactions.

Thanks and Regards
Radha Krishna
Albe Laurenz
2013-01-21 15:19:45 UTC
Post by bhanu udaya
Can you please let me know the procedure for the template approach? Will it restore the data also?
It will create a complete copy of an existing database.

The procedure is
CREATE DATABASE newdb TEMPLATE olddb;

Nobody may be connected to olddb for this to work.
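If stray connections keep blocking that, one way to clear them on 9.2 is:

SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = 'olddb'
  AND pid <> pg_backend_pid();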

Yours,
Laurenz Albe
bhanu udaya
2013-01-21 16:15:32 UTC
Hello,
Thanks all for the great help. The template is the best option seen so far. It completes a little more quickly than pg_restore, but it still takes 40 minutes. I am using Windows with 4GB RAM.

Thanks and Regards
Radha Krishna
bhanu udaya
2013-01-21 16:46:00 UTC
Hello All,
Can we achieve this template or pg_restore refresh in less than 20 minutes? Any more considerations? Kindly reply.

Thanks and Regards
Radha Krishna
Adrian Klaver
2013-01-21 17:17:04 UTC
Post by bhanu udaya
Hello All,
Can we achieve this template or pg_restore refresh in less than 20 minutes?
Any more considerations? Kindly reply.
Seems to me this is where Point in Time Recovery(PITR) might be helpful.

http://www.postgresql.org/docs/9.2/static/continuous-archiving.html
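A rough sketch of the idea on 9.2 (the paths and timestamp are illustrative; see the manual for the details):

# postgresql.conf: archive WAL continuously
wal_level = archive
archive_mode = on
archive_command = 'copy "%p" "e:\\wal_archive\\%f"'

# recovery.conf, to roll the cluster back to a known point
restore_command = 'copy "e:\\wal_archive\\%f" "%p"'
recovery_target_time = '2013-01-21 12:00:00'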
--
Adrian Klaver
***@gmail.com
Eduardo Morras
2013-01-30 11:39:15 UTC
Post by bhanu udaya
Hello All,
Perhaps you can use the NTFS snapshot feature.

a) Create the db with the data you want
b) Shut down PostgreSQL completely
c) Go to the data folder, properties, make a recovery point
d) Start PostgreSQL
e) Work with it
f) Stop PostgreSQL
g) Go to the data folder, properties, recover the previous version
h) Go to d)

I don't know the exact name of the NTFS feature; if I remember correctly it's a recovery point, but I haven't used Windows for a while.

Perhaps other folders should be set to recover.

With this "feature" it's easy to shoot your own feet (yes, plural). If something goes wrong, recover from the dump.

HTH
C. Bensend
2013-01-30 13:12:26 UTC
Post by Eduardo Morras
I don't know the exact name of the NTFS feature; if I remember correctly
it's a recovery point, but I haven't used Windows for a while.
I think you're talking about shadow copies. :)
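If so, a manual snapshot can be created from an elevated prompt on Windows Server editions with something like this (availability varies by Windows edition, so treat it as a sketch):

vssadmin create shadow /for=E:
vssadmin list shadows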

Benny
--
"The very existence of flamethrowers proves that sometime, somewhere,
someone said to themselves, 'You know, I want to set those people
over there on fire, but I'm just not close enough to get the job
done.'" -- George Carlin
John R Pierce
2013-01-30 17:18:50 UTC
Post by C. Bensend
Post by Eduardo Morras
I don't know the exact name of the NTFS feature; if I remember correctly
it's a recovery point, but I haven't used Windows for a while.
I think you're talking about shadow copies.
The UI and usability of NTFS shadow copies are quite frustrating. You can
create them, but you can't actually access them directly; they are only
accessible to software like backup programs that know how to use them.
--
john r pierce 37N 122W
somewhere on the middle of the left coast
Alban Hertroys
2013-01-21 15:22:25 UTC
Post by bhanu udaya
Can you please let me know the procedure for the template approach?
As they say, Google is your friend.

The basic principle is this: You create a read-only (template) version of
your sample database and use that as a template for the creation of new
ones. Of course, now you have another copy of the database stored, which
takes up another 9.5GB of disk space, but that's not much on modern systems
(oh wait, Windows & 4GB memory?...)

pg_dump would probably be quite a lot faster on a less limited system;
databases like plenty of memory and fast RAID arrays for disks.
Alan Hodgson
2013-01-21 16:10:16 UTC
Post by bhanu udaya
Hello, Thanks a lot for all your replies. I tried all the settings
suggested; it did not work. pg_restore is very slow and does not finish in
less than 1 1/2 hours. Can you please let me know the procedure for the
template approach? Will it restore the data also? Please update. I need the
database (metadata + data) to be restored back after certain Java
executions. Example: I have a 9.5 GB database (with 500 schemas + data).
This is treated as the base database and it needs to be restored every time
after certain transactions.
Don't use pg_restore, do the backups/restorations outside of PostgreSQL:

- run on a server with a snapshot-capable volume manager, use that for quick
restores
- just try using rsync from a backup copy of the base data directory

(either of the above requires PostgreSQL to not be running during the
restorations)
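A rough sketch of the rsync approach (the paths are illustrative; the server must be stopped for both the copy and the restore):

# one-time: take a pristine copy of the stopped cluster
pg_ctl -D /var/lib/pgsql/9.2/data stop
rsync -a /var/lib/pgsql/9.2/data/ /backups/base_data/

# per refresh: put the pristine copy back and start again
rsync -a --delete /backups/base_data/ /var/lib/pgsql/9.2/data/
pg_ctl -D /var/lib/pgsql/9.2/data start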
Chris Ernst
2013-01-21 14:19:53 UTC
Post by bhanu udaya
I am trying to restore a 9.5 GB database (1 GB dump file) which has 500
schemas with 1 lakh (100,000) rows in each schema. I took the data dump
using pg_dump, which takes around 40 minutes. I tried to use pg_restore to
restore this dump, but it takes hours. I have used the
configuration parameters as
....
Post by bhanu udaya
But I still have the same problem. It is almost 1 hour now and the
restoration is still going on. After every test case execution we would
like to refresh the database, and the expected refresh should complete in
less than 10 minutes. Is this achievable with the kind of configuration I
have listed in my earlier email?
Probably not what you want to hear, but I think this is a completely
unrealistic expectation. If it takes 40 minutes for pg_dump, I would
expect pg_restore to take at least as long and likely significantly
longer (assuming both are done on similar hardware).

pg_dump only has to read the schema(s) and data and write them to a
file. pg_restore has to read and write the schema and data into a new
database *AND* re-create all of the indexes, analyze, check referential
integrity...

So if the dump is taking 40 minutes, I would expect the restore to take
somewhere in the 60-90 minute range, depending on the number and
complexity of the indexes.

- Chris
Kevin Grittner
2013-01-21 21:28:10 UTC
Post by Adrian Klaver
Post by bhanu udaya
Can we achieve this template or pg_restore refresh in less than 20
minutes?
Seems to me this is where Point in Time Recovery(PITR) might be
helpful.
Maybe, if the source is on a different set of drives, to reduce
contention for storage and head movement. Either way it is down to
just a straight file copy, so it is the speed of your disk system
that is the limiting factor, not anything within PostgreSQL.

The more sure way of speeding it up is to add more spindles to your
drive array and make sure you are using a good RAID controller with
battery-backed cache.

-Kevin