COMMAND NAME: gprecoverseg

Recovers a primary or mirror segment instance that has 
been marked as down (if mirroring is enabled).

******************************************************
Synopsis
******************************************************

gprecoverseg [-p <new_recover_host>[,...]]
             |-i <recover_config_file> 
             [-d <coordinator_data_directory>]
             [-B <batch_size>] [-b <segment_batch_size>] [--differential]
             [-F] [-a] [-q] [-s] [--no-progress] [-l <logfile_directory>]
             [--max-rate <max_rate>]


gprecoverseg -r [--replay-lag <replay_lag>] [--disable-replay-lag]


gprecoverseg -o <output_recover_config_file> 
             [-p <new_recover_host>[,...]]

gprecoverseg -? 

gprecoverseg --version

******************************************************
DESCRIPTION
******************************************************

In a system with mirrors enabled, the gprecoverseg utility reactivates 
a failed segment instance and identifies the changed database files 
that require resynchronization. Once gprecoverseg completes this 
process, the system goes into resyncronizing mode until the recovered 
segment is brought up to date. The system is online and fully 
operational during resyncronization.

A segment instance can fail for several reasons, such as a host failure, 
network failure, or disk failure. When a segment instance fails, its 
status is marked as down in the Greenplum Database system catalog, 
and its mirror is activated in change tracking mode. In order to 
bring the failed segment instance back into operation again, you 
must first correct the problem that made it fail in the first place, 
and then recover the segment instance in Greenplum Database 
using gprecoverseg.

Segment recovery using gprecoverseg requires that you have an active 
mirror to recover from. For systems that do not have mirroring enabled, 
or in the event of a double fault (a primary and mirror pair both down 
at the same time) - do a system restart to bring the segments back 
online (gpstop -r).

By default, a failed segment is recovered in place, meaning that 
the system brings the segment back online on the same host and data 
directory location on which it was originally configured. In 
this case, use the following format for the recovery configuration file 
(using -i).

<failed_host_address>|<port>|<data_directory>

In some cases, this may not be possible (for example, if a host was 
physically damaged and cannot be recovered). In this situation, 
gprecoverseg allows you to recover failed segments to a completely 
new host (using -p), on an alternative data directory location on 
your remaining live segment hosts (using -s), or by supplying a 
recovery configuration file (using -i) in the following format. 
The word SPACE indicates the location of a required space. Do not 
add additional spaces.

<failed_host_address>|<port>|<data_directory>SPACE
[<recovery_host_address>|<port>|<data_directory>]

If hostname is not provided, host-address will be populated in the
configuration. In case you want to have different hostname and address,
hostname can be specified in the recovery configuration, the format
for the config file is as follows:
<failed_host_name>|<failed_host_address>|<port>|<data_directory>SPACE
[<recovery_host_name>|<recovery_host_address>|<port>|<data_directory>]

See the -i option below for details and examples of a recovery
configuration file.

The gp_segment_configuration system catalog tables can help you
determine your current segment configuration so that you can plan
your mirror recovery configuration. For example, run the following
query:

=# SELECT dbid, content, address, port, datadir 
   FROM gp_segment_configuration
   ORDER BY dbid;

The new recovery segment host must be pre-installed with the Greenplum 
Database software and configured exactly the same as the existing 
segment hosts. A spare data directory location must exist on all 
currently configured segment hosts and have enough disk space to 
accommodate the failed segments.

The recovery process marks the segment as up again in the Greenplum 
Database system catalog, and then initiates the resyncronization 
process to bring the transactional state of the segment 
up-to-date with the latest changes. The system is online 
and available during resyncronization.

If you do not have mirroring enabled or if you have both a 
primary and its mirror down, you must take manual steps to 
recover the failed segment instances and then restart the 
system, for example:

gpstop -r

******************************************************
OPTIONS
******************************************************

-a

Do not prompt the user for confirmation.


-B batch_size

The number of hosts to work on in parallel. If not specified,
the utility will start working on up to 16 hosts in parallel.
Valid values: 1-64


-b segment_batch_size

The number of segments per host to work on in parallel. If not
specified, the utility will start recovering up to 64 segments
in parallel on each host it is working on.
Valid values: 1-128


-d coordinator_data_directory

Optional. The coordinator host data directory. If not specified, 
the value set for $COORDINATOR_DATA_DIRECTORY will be used.


-F

Optional. Perform a full copy of the active segment instance 
in order to recover the failed segment. The default is to 
only copy over the incremental changes that occurred while 
the segment was down.

--differential

Optional. Perform a differential copy of the active segment instance
in order to recover the failed segment. The default is to only copy
over the incremental changes that occurred while the segment was down.

--max-rate <max_rate>

Optional. Specifies the max-rate for full recovery of failed segments.
Rate is in kB/s, suffix `k` or `M` can be used to signify kB/s or MB/s.
Valid rate ranges from 32kB/s to 1048576kB/s(1024MB/s). The default is to use the
entire available network bandwidth to perform full recovery of failed segment.

-i <recover_config_file>

Specifies the name of a file with the details about failed segments to recover. 
Each line in the file is in the following format. The word SPACE indicates 
the location of a required space. Do not add additional spaces.

<failed_host_address>|<port>|<data_directory>SPACE
<recovery_host_address>|<port>|<data_directory>

Comments
Lines beginning with # are treated as comments and ignored.

Segments to Recover
Each line after the first specifies a segment to recover. This line can 
have one of three formats. In the event of in-place recovery, enter one
group of colon delimited fields in the line:

failedAddress|failedPort|failedDataDirectory

For recovery to a new location, enter two groups of fields separated by 
a space in the line. The required space is indicated by SPACE. Do not 
add additional spaces.

failedAddress|failedPort|failedDataDirectorySPACEnewAddress|newPort|
newDataDirectory

To do mix recovery, include a field with values I/D/F/i/d/f on each line. Default is
incremental recovery if recovery_type field is not provided.

recoveryType|failedAddress|failedPort|failedDataDirectory

recoveryType field supports below values:
    I/i for incremental recovery
    D/d for differential recovery
    F/f for full recovery


Examples

In-place recovery of a single mirror

sdw1-1|50001|/data1/mirror/gpseg16

Recovery of a single mirror to a new host

sdw1-1|50001|/data1/mirror/gpseg16 SPACE sdw4-1|50001|51001|/data1/recover1/gpseg16

Recovery of a single mirror to a new host with hostname specified:
sdw1|sdw1-1|50001|/data1/mirror/gpseg16 SPACE
sdw4|sdw4-1|50001|51001|/data1/recover1/gpseg16

In-place recovery of down mirrors with recovery type:
sdw1-1|50001|/data1/mirror/gpseg1   // Does incremental recovery (By default)
I|sdw1-1|50001|/data1/mirror/gpseg1   // Does incremental recovery
i|sdw1-1|50001|/data1/mirror/gpseg1   // Does incremental recovery
D|sdw2-1|50002|/data1/mirror/gpseg2   // Does Differential recovery
d|sdw2-1|50002|/data1/mirror/gpseg2   // Does Differential recovery
F|sdw3-1|50003|/data1/mirror/gpseg3   // Does Full recovery
f|sdw3-1|50003|/data1/mirror/gpseg3   // Does Full recovery

Obtaining a Sample File

You can use the -o option to output a sample recovery configuration file 
to use as a starting point.

-l <logfile_directory>

The directory to write the log file. Defaults to ~/gpAdminLogs.

-o <output_recover_config_file>

Specifies a file name and location to output a sample 
recovery configuration file. The output file lists the 
currently invalid segments and their default recovery 
location in the format that is required by the -i option. 
Use together with the -p option to output a sample file 
for recovering to a different host. This file can be 
edited to supply alternate recovery locations if needed.


-p <new_recover_host>[,...]

Specifies a spare host outside of the currently configured 
Greenplum Database array on which to recover invalid segments. In 
the case of multiple failed segment hosts, you can specify a 
comma-separated list. The spare host must have the Greenplum Database 
software installed and configured, and have the same hardware and OS 
configuration as the current segment hosts (same OS version, locales, 
gpadmin user account, data directory locations created, ssh keys 
exchanged, number of network interfaces, network interface naming 
convention, and so on.).


-q

Run in quiet mode. Command output is not displayed on 
the screen, but is still written to the log file.


-r

After a segment recovery, segment instances may not be returned to the 
preferred role that they were given at system initialization time. This 
can leave the system in a potentially unbalanced state, as some segment 
hosts may have more active segments than is optimal for top system performance. 
This option rebalances primary and mirror segments by returning them to 
their preferred roles. All segments must be valid and synchronized before 
running gprecoverseg -r. If there are any in progress queries, they will 
be cancelled and rolled back.

--replay-lag
Replay lag(in GBs) allowed on mirror when rebalancing the segments. Default is 10 GB. If
the replay_lag (flush_lsn-replay_lsn) is more than the value provided with this option
then rebalance will be aborted.


--disable-replay-lag
Disable replay lag check when rebalancing segments


-s

Show pg_rewind/pg_basebackup progress sequentially instead of inplace. Useful
when writing to a file, or if a tty does not support escape sequences. Defaults
to showing the progress inplace.


--no-progress

Do not show pg_rewind/pg_basebackup progress. Useful when writing to a
file. Defaults to showing the progress.


--hba-hostnames

Optional. use hostnames instead of CIDR in pg_hba.conf


-v

Sets logging output to verbose.

--version

Displays the version of this utility.

-?
Displays the online help.


******************************************************
EXAMPLES
******************************************************

Recover any failed segment instances in place:

 $ gprecoverseg


Rebalance your Greenplum Database system after a recovery by
resetting all segments to their preferred role. First check
that all segments are up and synchronized.

  $ gpstate -m

  $ gprecoverseg -r


Recover any failed segment instances to a newly configured 
spare segment host:

  $ gprecoverseg -i recover_config_file


Output the default recovery configuration file:

  $ gprecoverseg -o /home/gpadmin/recover_config_file


******************************************************
SEE ALSO
******************************************************

gpstart, gpstop, gpstate
