[Pgcluster-general] replication server gets stuck when a deadlock occurs

Dmitry Deniskin reggiman at yandex.ru
Thu Jul 5 11:13:34 UTC 2007


 Hello,

 We use two database nodes and one replication server.
The replication server gets stuck when a deadlock occurs.

We tried versions 1.7.0rc7 and 1.5.0.rc16 of PgCluster.
 
Production environment:
- Debian Etch amd64
- PgCluster v.1.7.0rc7

- node 1 uses dual-core Intel Xeon (2 cpu)
- node 2 uses dual-core Intel Xeon (1 cpu)
- replication server uses Intel Core2Duo E6320 (1 cpu)

We saw no such problems in the testing environment (VMware Server).



Here are the configuration files:

Replicator:
<Cluster_Server_Info>
    <Host_Name>                 node1.ourdomain.com.              </Host_Name>
    <Port>                      5432                            </Port>
    <Recovery_Port>             7001                            </Recovery_Port>
</Cluster_Server_Info>
<Cluster_Server_Info>
    <Host_Name>                 node2.ourdomain.com.      </Host_Name>
    <Port>                      5432                            </Port>
    <Recovery_Port>             7001                            </Recovery_Port>
</Cluster_Server_Info>
<Host_Name>                     replicator1.ourdomain.com.                </Host_Name>
<Replication_Port>              8001                            </Replication_Port>
<Recovery_Port>         8101                            </Recovery_Port>
<RLOG_Port>                     8301                            </RLOG_Port>
<Response_Mode>         reliable                                </Response_Mode>
<Use_Replication_Log>           no                              </Use_Replication_Log>
<Replication_Timeout>           5s                              </Replication_Timeout>
<LifeCheck_Timeout>             1s                              </LifeCheck_Timeout>
<LifeCheck_Interval>            15s                             </LifeCheck_Interval>
<Log_File_Info>
        <File_Name>             /tmp/pgreplicate.log    </File_Name>
        <File_Size>             1M                      </File_Size>
        <Rotate>                3                       </Rotate>
</Log_File_Info>


Node1:

<Replicate_Server_Info>
        <Host_Name>             replicator1.ourdomain.com </Host_Name>
        <Port>                  8001                            </Port>
        <Recovery_Port>         8101                            </Recovery_Port>
</Replicate_Server_Info>
<Host_Name>                     node1.ourdomain.com               </Host_Name>
<Recovery_Port>         7001                            </Recovery_Port>
<Rsync_Path>                    /usr/bin/rsync                  </Rsync_Path>
<Rsync_Option>                  ssh                             </Rsync_Option>
<Rsync_Compress>                yes                             </Rsync_Compress>
<Pg_Dump_Path>                  /usr/local/pgsql/bin/pg_dump    </Pg_Dump_Path>
<When_Stand_Alone>              read_only                       </When_Stand_Alone>
<Replication_Timeout>           1 min                           </Replication_Timeout>
<LifeCheck_Timeout>             3s                              </LifeCheck_Timeout>
<LifeCheck_Interval>            11s                             </LifeCheck_Interval>

Node2:

<Replicate_Server_Info>
        <Host_Name>             replicator1.ourdomain.com </Host_Name>
        <Port>                  8001                            </Port>
        <Recovery_Port>         8101                            </Recovery_Port>
</Replicate_Server_Info>
<Host_Name>                     node2.ourdomain.com               </Host_Name>
<Recovery_Port>         7001                            </Recovery_Port>
<Rsync_Path>                    /usr/bin/rsync                  </Rsync_Path>
<Rsync_Option>                  ssh                             </Rsync_Option>
<Rsync_Compress>                yes                             </Rsync_Compress>
<Pg_Dump_Path>                  /usr/local/pgsql/bin/pg_dump    </Pg_Dump_Path>
<When_Stand_Alone>              read_write                      </When_Stand_Alone>
<Replication_Timeout>           1 min                           </Replication_Timeout>
<LifeCheck_Timeout>             3s                              </LifeCheck_Timeout>
<LifeCheck_Interval>            11s                             </LifeCheck_Interval>
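
To rule out a plain configuration mismatch, the three files above can be cross-checked mechanically. The following is only a minimal sketch in Python, not part of PgCluster: it assumes the fragments contain nothing that breaks XML parsing once wrapped in a synthetic root element, and it assumes that a node's Replicate_Server_Info ports must equal the replicator's Replication_Port and Recovery_Port, and that the replicator's Cluster_Server_Info entry must carry the node's own Recovery_Port. The names parse_fragment, value, host and cross_check are hypothetical helpers, and the sample strings are abbreviated copies of the settings above.

```python
import xml.etree.ElementTree as ET

# Abbreviated copies of the fragments posted above (only the fields checked here).
REPLICATOR = """
<Cluster_Server_Info>
    <Host_Name> node1.ourdomain.com. </Host_Name>
    <Port> 5432 </Port>
    <Recovery_Port> 7001 </Recovery_Port>
</Cluster_Server_Info>
<Host_Name> replicator1.ourdomain.com. </Host_Name>
<Replication_Port> 8001 </Replication_Port>
<Recovery_Port> 8101 </Recovery_Port>
"""

NODE1 = """
<Replicate_Server_Info>
    <Host_Name> replicator1.ourdomain.com </Host_Name>
    <Port> 8001 </Port>
    <Recovery_Port> 8101 </Recovery_Port>
</Replicate_Server_Info>
<Host_Name> node1.ourdomain.com </Host_Name>
<Recovery_Port> 7001 </Recovery_Port>
"""


def parse_fragment(text):
    """Wrap a config fragment in a synthetic root so ElementTree accepts it."""
    return ET.fromstring("<root>" + text + "</root>")


def value(elem, tag):
    """Return the stripped text of a direct child, or None if it is absent."""
    if elem is None:
        return None
    child = elem.find(tag)
    return child.text.strip() if child is not None and child.text else None


def host(elem):
    """Host_Name of an element, ignoring a trailing dot on the FQDN."""
    name = value(elem, "Host_Name")
    return name.rstrip(".") if name else None


def cross_check(replicator_text, node_text):
    """Return a list of mismatches between a replicator and a node config."""
    rep = parse_fragment(replicator_text)
    node = parse_fragment(node_text)
    upstream = node.find("Replicate_Server_Info")
    problems = []
    if value(upstream, "Port") != value(rep, "Replication_Port"):
        problems.append("replication port mismatch")
    if value(upstream, "Recovery_Port") != value(rep, "Recovery_Port"):
        problems.append("replicator recovery port mismatch")
    if host(upstream) != host(rep):
        problems.append("replicator host name mismatch")
    # The node should appear in one of the replicator's Cluster_Server_Info blocks.
    entries = [c for c in rep.findall("Cluster_Server_Info") if host(c) == host(node)]
    if not entries:
        problems.append("node not listed in replicator config")
    elif value(entries[0], "Recovery_Port") != value(node, "Recovery_Port"):
        problems.append("node recovery port mismatch")
    return problems


if __name__ == "__main__":
    print(cross_check(REPLICATOR, NODE1))
```

For the values posted here, cross_check returns an empty list, so under these assumptions the ports and host names in the node and replicator files agree with each other.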


-- 
Dmitry


