root/trunk/omnipitr/doc/omnipitr-restore.pod

Revision 183, 11.4 kB (checked in by depesz, 4 years ago)

documentation for new option (not yet implemented) for omnipitr-restore

1 =head1 OmniPITR - omnipitr-restore
2
3 =head2 USAGE
4
5 /some/path/omnipitr/bin/omnipitr-restore [options] %f %p
6
7 Options:
8
9 =over
10
11 =item --data-dir (-D)
12
13 Where PostgreSQL datadir is located (path) (defaults to current working
14 directory).
15
16 =item --source (-s)
17
18 Where I<omnipitr-restore> can find wal segments to use.
19
20 Check L<Source specification> for more details.
21
22 =item --recovery-delay (-w)
23
24 Delay when recovering wal segments (in seconds).
25
26 This is primarily used to keep a window of safety before a statement such as
27 I<DELETE FROM main_table> is applied on the slave database.
28
29 =item --finish-trigger (-f)
30
31 Name of file to watch for existence - if it exists, recovery process will stop,
32 and PostgreSQL slave will become fully functional.
33
34 Check L<Finishing recovery> section for more details.
35
36 =item --remove-unneeded (-r)
37
38 Makes I<omnipitr-restore> remove unneeded wal segments. Unneeded does B<not>
39 simply mean already passed to PostgreSQL - I<omnipitr-restore> checks the last
40 redo segment to make sure removal is safe.
41
42 =item --remove-at-a-time (-rt)
43
44 When removing old segments, remove at most that many segments before re-entering
45 loop to check for signals and/or new segment availability for Postgres.
46
47 Defaults to 3.
48
49 =item --removal-pause-trigger (-p)
50
51 Name of file to watch for existence. If it exists, I<omnipitr-restore> will not
52 remove unneeded wal segments, regardless of the I<--remove-unneeded> option. This
53 provides a way to make backups on slave.
54
55 =item --pre-removal-processing (-h)
56
57 If given, the argument will be treated as a shell command to run before any
58 segment is removed from the archive.
59
60 If the hook finishes without errors, the segment will be removed. If there
61 are errors, the removal procedure will be postponed and retried after some
62 time.
63
64 One extra parameter is appended to the command: the name of the segment file
65 to be processed (prepared in such a way that it is relative to the current
66 working directory).
67
68 Passed segment will always be uncompressed.
69
70 =item --temp-dir (-t)
71
72 Where to create temporary files (defaults to /tmp or I<$TMPDIR> environment
73 variable location). This is only used when using pre-removal-processing.
74
75 =item --log (-l)
76
77 Name of logfile (actually a template, as it supports L<strftime(3)>-style
78 markers). Unfortunately, because PostgreSQL itself uses %x sequences, we cannot
79 use % markers directly. Instead, any occurrence of the ^ character in the log
80 file name is first changed to %, and the result is then passed to strftime.
81
82 =item --pid-file
83
84 Name of file to use for pidfile. If it is specified, then only one copy of
85 I<omnipitr-restore> (with this pidfile) can run at the same time.
86
87 Trying to run second copy of I<omnipitr-restore> will result in an error.
88
89 =item --verbose (-v)
90
91 Log verbosely what is happening.
92
93 =item --gzip-path (-gp)
94
95 Full path to gzip program - in case you can't set proper PATH environment
96 variable.
97
98 =item --bzip2-path (-bp)
99
100 Full path to bzip2 program - in case you can't set proper PATH environment
101 variable.
102
103 =item --lzma-path (-lp)
104
105 Full path to lzma program - in case you can't set proper PATH environment
106 variable.
107
108 =item --pgcontroldata-path (-pp)
109
110 Full path to pg_controldata program - in case you can't set proper PATH
111 environment variable.
112
113 =item --error-pgcontroldata (-ep)
114
115 Sets handler for errors with pg_controldata. Possible options:
116
117 =over
118
119 =item * ignore - warn in logs, but nothing else to be done - after some time,
120 recheck if pg_controldata works
121
122 =item * hang - enter infinite loop, waiting for admin interaction, but not
123 finishing recovery
124
125 =item * break - breaks recovery, and returns error status to PostgreSQL
126 (default)
127
128 =back
129
130 Please check L<ERRORS> section below for more details.
131
132 =back
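
To illustrate the ^ markers accepted by I<--log>, the sketch below mimics the
documented two-step substitution in plain shell. The template path is a made-up
example, and C<tr> and C<date> only stand in for whatever I<omnipitr-restore>
does internally.

```shell
# Hypothetical log template, written with ^ instead of % so that
# PostgreSQL does not try to expand the markers in restore_command.
template='/var/log/omnipitr/restore-^Y-^m-^d.log'

# Step 1: every ^ character becomes %.
strftime_format=$(printf '%s' "$template" | tr '^' '%')

# Step 2: the result is used as a strftime(3)-style format.
logfile=$(date +"$strftime_format")

printf '%s\n' "$strftime_format"
printf '%s\n' "$logfile"
```

With such a template, a new log file name would be produced for each day.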
133
134 =head2 DESCRIPTION
135
136 Call to I<omnipitr-restore> should be in I<restore_command> variable in
137 I<recovery.conf>.
138
139 Which options should be given depends only on installation, but generally you
140 will need at least:
141
142 =over
143
144 =item * --data-dir
145
146 PostgreSQL "%p" passed file path is relative to I<DATADIR>, so it is required to
147 know it.
148
149 =item * --log
150
151 to make sure that information about recovery progress is logged someplace
152
153 =item * --source
154
155 to specify where to load WAL segments from
156
157 =back
158
159 If you'll specify more than 1 destination, you will also need to specify
160 I<--state-dir>
161
162 Of course you can provide many --dst-local or many --dst-remote or any mix of
163 these.
164
165 Generally omnipitr-restore will try to deliver WAL segment to all destinations,
166 and will fail if B<any> of them will not accept new segment.
167
168 Segments will be transferred to destinations in this order:
169
170 =over
171
172 =item 1. All B<local> destinations, in order provided in command line
173
174 =item 2. All B<remote> destinations, in order provided in command line
175
176 =back
177
178 In case any destination fails, I<omnipitr-restore> will save state (which
179 destinations it delivered the file to) and return error to PostgreSQL - which
180 will cause PostgreSQL to call I<omnipitr-restore> again for the same WAL
181 segment after some time.
182
183 State directory will be cleared after every successful file send, so it should
184 stay small in size (expect 1 file of under 500 bytes).
185
186 When constructing command line to put in I<restore_command> PostgreSQL GUC,
187 please remember that while providing C<"%p" "%f"> will work, I<omnipitr-restore>
188 requires only "%p"
189
190 =head3 Source specification
191
192 If the wal segments are compressed you have to prefix source path with
193 compression type followed by '=' sign.
194
195 Allowed compression types:
196
197 =over
198
199 =item * gzip
200
201 Decompresses with gzip program, used file extension is .gz
202
203 =item * bzip2
204
205 Decompresses with bzip2 program, used file extension is .bz2
206
207 =item * lzma
208
209 Decompresses with lzma program, used file extension is .lzma
210
211 =back
212
213 If you want to pass any extra arguments to compression program, you can either:
214
215 =over
216
217 =item * make a wrapper
218
219 Write a program/script that will be named in the same way your actual
220 compression program is named, but adding some parameters to call
221
222 =item * use environment variables
223
224 All of supported compression programs use environment variables:
225
226 =over
227
228 =item * gzip - GZIP
229
230 =item * bzip2 - BZIP2
231
232 =item * lzma - XZ_OPT
233
234 =back
235
236 For details - please consult manual to your chosen compression tool.
237
238 =back
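
For example, if the archive holds gzip-compressed segments, the source could be
given as in the line below. The paths are placeholders in the style of the
EXAMPLES section, not values taken from any real installation.

```
restore_command='/.../omnipitr-restore -D /mnt/data/ -l /var/log/omnipitr/restore.log -s gzip=/mnt/wal_restore/ %f %p'
```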
239
240 =head3 Finishing recovery
241
242 There are 2 ways I<omnipitr-restore> can finish recovery, and there are 2
243 separate ways to signal it that it should finish.
244
245 First, the finishing procedures:
246
247 =over
248
249 =item * smart
250
251 In this mode I<omnipitr-restore> will feed all available WAL segments to
252 PostgreSQL (without any I<--recovery-delay> induced delay), and then finish
253 restoration process.
254
255 =item * immediate
256
257 In this mode I<omnipitr-restore> will skip all pending WAL segments, and make
258 PostgreSQL finish recovery immediately.
259
260 This can be useful after running a really bad query (think: TRUNCATE users),
261 when you want to prevent the change from being replicated to slave.
262
263 =back
264
265 Now. I<omnipitr-restore> can be signaled into finishing recovery in 2 ways, one
266 of which is optional.
267
268 =over
269
270 =item * trigger file
271
272 This one is optional. If you will use --finish-trigger switch,
273 I<omnipitr-restore> will look for this file, and if it exists - it will start
274 finishing.
275
276 If the file exists, and contains string "NOW" (without quotation characters, but
277 with optional new line character "\n"), I<omnipitr-restore> will enter
278 "immediate finish" procedure. If the content is different, or the file is empty
279 - it will proceed in smart finish mode.
280
281 After OmniPITR finishes recovery and PostgreSQL enters its normal mode of
282 working, it's strongly advised to remove this file.
283
284 =item * system signal
285
286 This one works always, regardless of --finish-trigger switch. Generally you can
287 send system signals (kill) to I<omnipitr-restore> to make it go to finish
288 recovery procedure.
289
290 Only 1 signal is supported:
291
292 =over
293
294 =item * SIGUSR1
295
296 makes the finish I<immediate>
297
298 =back
299
300 It is currently not possible to force 'smart' finishing by signal, due to
301 the fact that I<omnipitr-restore> is restarted after every segment.
302
303 =back
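
The trigger-file rules above can be sketched as a small shell helper. The
function name C<request_finish> is hypothetical, and the trigger path passed to
it has to match whatever was given to I<--finish-trigger>.

```shell
# Hypothetical helper: create the finish trigger in the requested mode.
request_finish() {
    mode=$1
    trigger=$2
    case "$mode" in
        smart)
            # Any content other than "NOW" (here: an empty file) means
            # smart finish.
            : > "$trigger"
            ;;
        immediate)
            # "NOW", optionally followed by a newline, means immediate finish.
            printf 'NOW\n' > "$trigger"
            ;;
        *)
            echo "usage: request_finish smart|immediate /path/to/trigger" >&2
            return 1
            ;;
    esac
}

# Example, using the trigger path from the EXAMPLES section:
# request_finish immediate /tmp/finish.trigger
```

Once PostgreSQL is out of recovery, the trigger file should be removed again,
as advised above.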
304
305 =head3 Segment removal
306
307 omnipitr-restore will automatically remove segments that are no longer
308 necessary.
309
310 To make it happen, it will periodically run I<pg_controldata> program, and check
311 name of last segment required for redo.
312
313 If pre-removal-processing is defined, it will be called before actual removal.
314
315 omnipitr-restore will remove segments chronologically - oldest segments first.
316
317 One useful idea for pre-removal-processing, is using L<omnipitr-archive> for
318 processing - to send xlog segments to some permanent storage place.
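
As a simpler illustration of the hook contract (segment path appended as the
last argument, non-zero exit status postpones removal), here is a minimal
sketch that just copies the segment to another directory. The function name
C<keep_segment> and the destination directory are assumptions made for this
example only.

```shell
# Hypothetical hook body: keep a copy of the segment before it is removed.
# $1 - destination directory (given in the hook command itself)
# $2 - segment path (appended by omnipitr-restore)
keep_segment() {
    dest_dir=$1
    segment=$2
    name=$(basename "$segment")

    # A missing segment is an error, so removal gets postponed.
    [ -f "$segment" ] || return 1

    # Copy under a temporary name first, so a partial copy is never
    # mistaken for a complete segment.
    cp "$segment" "$dest_dir/.$name.tmp" || return 1
    mv "$dest_dir/.$name.tmp" "$dest_dir/$name"
}
```

Saved as a script that ends with C<keep_segment "$@">, it could be passed to
I<--pre-removal-processing> together with the destination directory, so that
the appended segment name becomes the second argument.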
319
320 =head2 ERRORS
321
322 =head3 pg_controldata
323
324 Sometimes, for a not yet known reason, pg_controldata fails (doesn't print
325 anything, exits with status -1).
326
327 In such a situation omnipitr-restore used to die too, with an error, which had
328 one bad consequence: it made PostgreSQL assume that it should stop recovery, go
329 directly into standalone mode, and start accepting connections.
330
331 Because this is not desirable, there is now a switch (I<--error-pgcontroldata>)
332 to change the behaviour in case of such an error.
333
334 It has 3 possible values.
335
336 =head4 break
337
338 "Break" means break recovery, and return error to PostgreSQL. This is the
339 default behaviour, to keep it backward compatible.
340
341 =head4 ignore
342
343 With this error handling, omnipitr-restore will simply ignore all errors from
344 pg_controldata - i.e. it will print information about the error to logs, but
345 it will not change anything - it will still try to work on next wal segments,
346 and after 5 minutes it will retry pg_controldata.
347
348 This is the most transparent setting, but also kind of dangerous. It means that
349 if there is an intermittent problem that makes pg_controldata fail only some of
350 the time, it might simply get overlooked, because replication will work just fine.
351
352 And this can mean that the real source of the error can propagate and do more
353 harm.
354
355 =head4 hang
356
357 With "hang" error handling, in case of pg_controldata failure, omnipitr-restore
358 will basically hang - i.e. enter infinite loop, not doing anything.
359
360 Of course before entering infinite loop, it will log information about the
361 problem.
362
363 While it might seem pointless, it has two benefits:
364
365 =over
366
367 =item * It will not cause slave server to become standalone
368
369 =item * It is easily detectable, as long as you're checking size of wal archive
370 directory, or wal replication lag, or any other metric about replication - as
371 replication will be 100% stopped.
372
373 =back
374
375 To recover from hung recovery, you have to restart PostgreSQL, i.e. run this
376 sequence of commands (or something similar depending on what you're using to
377 start/stop your PostgreSQL):
378
379     pg_ctl -D /path/to/pgdata -m fast stop
380     pg_ctl -D /path/to/pgdata start
381
382 Of course, when the next pg_controldata error happens, it will hang again.
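
As an illustration only, a I<restore_command> that selects the "hang" handler
might look like the line below. The paths are placeholders in the style of the
EXAMPLES section.

```
restore_command='/.../omnipitr-restore -D /mnt/data/ -l /var/log/omnipitr/restore.log -s /mnt/wal_restore/ -ep hang %f %p'
```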
383
384 =head2 EXAMPLES
385
386 =head3 Minimal setup:
387
388     restore_command='/.../omnipitr-restore -l /var/log/omnipitr/restore.log -s /mnt/wal_restore/ %f %p'
389
390 =head3 Minimal setup, but with defined finish trigger and recovery delay (5
391 minutes):
392
393     restore_command='/.../omnipitr-restore -D /mnt/data/ -l /var/log/omnipitr/restore.log -s /mnt/wal_restore/ -w 300 -f /tmp/finish.trigger %f %p'
394
395 =head3 Setup as above, but with pause trigger defined for doing backups-on-slave and removing unneeded segments
396
397     restore_command='/.../omnipitr-restore -D /mnt/data/ -l /var/log/omnipitr/restore.log -s /mnt/wal_restore/ -w 300 -f /tmp/finish.trigger -r -p /tmp/pause.trigger %f %p'
398
399 =head3 Minimal setup, but with backing up segments to remote server:
400
401     restore_command='/.../omnipitr-restore -l /var/log/omnipitr/restore.log -s /mnt/wal_restore/ -h "/.../omnipitr-archive --force-data-dir -l /var/log/omnipitr/archive.log -dr bzip2=rsync://backup/postgres/xlogs/" %f %p'
402
403 =head2 COPYRIGHT
404
405 The OmniPITR project is Copyright (c) 2009-2010 OmniTI. All rights reserved.
406