The script P4Transfer.py solves the problem: how to transfer changes between unrelated and unconnected Perforce Helix Core repositories.
If you have a 2015.1 or greater Helix Core server then you can use the Helix native DVCS commands (p4 clone
/p4 fetch
/p4 push
). If you have a pre-2015.1 version of Helix Core (or very large repository sizes to be transferred), then P4Transfer.py
may be your best option! See next section for guidance.
While best practice is usually to have only one Perforce Server, the reality is often that many different Perforce Servers are required. A typical example is the Perforce public depot, which sits outside the Perforce Network.
Sometimes you start a project on one server and then realize that it would be nice to replicate these changes to another server. For example, you could start out on a local server on a laptop while being on the road but would like to make the results available on the public server. You may also wish to consolidate various Perforce servers into one with all their history (e.g. after an acquisition of another company).
Both Helix native DVCS and P4Transfer enable migration of detailed file history from a set of paths on one Helix Core server into another server.
Helix native DVCS has additional functionality, such as creating a personal micro-repo for personal use without a continuously running p4d
process.
Important
|
When it is an option, using Helix native DVCS is preferred. But if that can’t be used for some reason, P4Transfer can pretty much always be made to work. |
Tip
|
The new FetchTransfer.py script is a wrapper around the DVCS p4 fetch command with some advantages over P4Transfer.
|
Pros for native DVCS:
-
Helix native DVCS is a fully supported product feature.
-
Helix native DVCS requires consistency among the Helix Core servers: Same case sensitivity setting, same Unicode mode, and same P4D version (with some exceptions).
-
Helix native DVCS is easy to setup and use.
-
Helix native DVCS requires particular configurables to be setup, and may be viewed as a security concern for the extraction of history (this can be controlled with pre-command triggers)
-
Helix native DVCS has some limitations at larger scales (depends on server RAM as it creates zip files) - won’t scale to TB requiring to be transferred.
-
Helix native DVCS is generally deemed to produce the highest possible quality data on the target server, as it transfers raw journal data and archive content.
-
Helix native DVCS is very fast.
-
Helix native DVCS has no external dependencies.
-
Helix native DVCS bypasses submit triggers which can interfere with P4Transfer.
When P4Transfer might be a better option:
-
If Helix native DVCS hits data snags, there may not be an easy way to work around them, unless a new P4D server release address them.
-
Helix native DVCS must be explicitly enabled. If not already enabled, it requires 'super' access on both Helix Core servers involved in the process.
-
P4Transfer is community supported (and with paid Consulting engagements).
-
P4Transfer can make timestamps more accurate if it has 'admin' level access, but does not require it.
-
P4Transfer automates front-door workflows, i.e. it does would a human with a
p4
command line client, fleet fingers and a lot of time could do. -
P4Transfer can work with mismatched server data sets (different case sensitivity, Unicode mode, or P4D settings) so long as the actual data to be migrated doesn’t cause issues. (For example, if you try to copy paths that actually do have Korean glyphs in the path name to a non-Unicode server, that ain’t gonna work).
-
If P4Transfer does have issues with importing some data (like the above example with Unicode paths), you can manually work around those snags, and then pick up with the next changelist. If there aren’t too many snags, this trial and error process is viable.
-
P4Transfer is reasonably fast as a front-door mechanism.
-
P4Transfer can be run as a service for continuous operation, and has successfully run for months or more.
-
P4Transfer data quality is excellent in terms of detail (e.g. integration history, resolve options, etc.). However, it is an emulation of history rather than a raw replay of actually history in the native format as done by Helix native DVCS. The nuance in history differences rarely has practical implications.
-
P4Transfer requires more initial setup than Helix native DVCS.
-
P4Transer requires Python (2.7 or 3.6+).
-
P4Transfer can be interfered with by custom policy enforcement triggers on the target server.
-
P4Transfer is subject to submit triggers that may block its operation, possibly intentionally.
-
Helix native DVCS requires a direct connection betwen the related p4d servers, whereas P4Transfer requires only a connection from a client machine to each of the p4d servers. Thus, P4Transfer can operate as long as it operates in an environment where it can reach each of the target p4d servers, even of those p4d servers cannot communicate directly with each other, e.g. due to network firewalls.
The basic idea is to create a client workspace on each Perforce Server that maps the projects to be transferred. Both client workspaces must share the same root directory and client side mapping. For example:
Source client:
Client: workspace_server1
Root: /work/transfer
View:
//depot/myproject/dev/... //workspace_server1/depot/myproject/dev/...
//depot/other/dev/... //workspace_server1/depot/other/dev/...
Target client:
Client: workspace_server2
Root: /work/transfer
View:
//import/mycode/... //workspace_server2/depot/myproject/dev/...
//import/stuff/... //workspace_server2/depot/other/dev/...
These client workspaces are created automatically from the view
entries in the config file described below.
P4Transfer works uni-directionally. The tool will inquire the changes for the workspace files and compare these to a counter.
P4Transfer uses a single configuration file that contains the information of both servers as well as the current counter values. The tool maintains its state counter using a Perforce counter on the target server (thus requiring review
privilege as well as write
privilege – by default it assumes super
user privilege is required since it updates changelist owners and date/time to the same as the source – this functionality is controlled by the config file).
It is now possible to migrate streams to streams. This is done by creating a source workspace which can read any specified stream (it is just a normal workspace since syncing from any stream is possible). The target workspace is a special stream workspace which is a mainline with no share …
line in it, but which uses import+
to write to all the desired target streams.
Notes/restrictions on streams:
-
Currently supports streams→streams only (not classic to streams)
-
Takes whole streams rather than allowing filtering of source streams
-
Will auto-create target streams if necessary when you use wildcard matching ('*') in source/target stream names, e.g. allowing you to take all
//streams_depot/release*
streams across easily.
Important
|
Remember that humans should not be writing to the targets of a P4Transfer instance (streams or otherwise) - or the script may fail in strange ways! |
You will need Python 2.7 or 3.6+ and P4Python 2017.2+ to make this script work.
The easiest way to install P4Python is probably using "pip" (or "pip3") – make sure this is installed. Then:
pip install p4python
Tip
|
If the above needs to build and fails, then this usually works for Python 3.6: pip3 install p4python==2017.2.1615960
|
Alternatively, refer to P4Python Docs
If you are on Windows, then look for an appropriate version on the Perforce ftp site (for your Python version), e.g. http://ftp.perforce.com/perforce/r20.1/bin.ntx64/
The easiest thing to do is to download this repo either by:
-
running
git clone https://github.com/perforce/p4transfer
-
or by downloading the project zip file and unzipping.
The minimum requirements are the modules P4Transfer.py
and logutils.py
If you have installed P4Python as above, then check the requirements.txt
for other modules to install via pip
or pip3
.
Note that if running it on Windows, and especially if the source server has filenames containing say umlauts or other non-ASCII characters, then Python 2.7 is required currently due to the way Unicode is processed. Python 3.6+ on Mac/Unix should be fine with Unicode as long as you are using P4Python 2017.2+
Create the workspaces for both servers, ensuring that the root directories and client views match.
Now initialize the configuration file, by default called transfer.cfg
. This can be generated by the script:
python3 P4Transfer.py --sample-config > transfer.yaml
Then edit the resulting file, paying attention to the comments.
The password stored in P4Passwd is optional if you do not want to rely on tickets. The tool performs a login if provided with a password, so it should work with security=3
or auth_check
trigger set.
Note that although the workspaces are named the same for both servers in this example, they are completely different entities.
A typical run of the tool would produce the following output:
C:\work\> python3 P4Transfer.py -c transfer.yaml -r
2014-07-01 15:32:34,356:P4Transfer:INFO: Transferring 0 changes
2014-07-01 15:32:34,361:P4Transfer:INFO: Sleeping for 1 minutes
If there are any changes missing, they will be applied consecutively.
P4Transfer has various options – these are documented via the -h
or --help
parameters.
The following text may not display properly if your are viewing this P4Transfer.adoc file in GitHub. Please refer the the .pdf version instead, or open up help.txt file directly.
link:help.txt[role=include]
-
--notransfer
- useful to validate your config file and that you can connect to source/target p4d servers, and report on how many changes might be transferred. -
--maximum
- useful to perform a test transfer of a single changelist when you get started (although remember this might be a changelist with a lot of files!) -
--keywords
- useful to avoid issues with expanding of keywords on a different server. Keywords make it hard to compare source/target results, and make transfers slower due to extra work required. -
--end-datetime
- useful to schedule a run of P4Transfer and have it stop at the desired time (e.g. run overnight and stop when users start work in the morning). Useful for scheduling long running transfers (can be many days) in quiet periods (e.g. together with Linuxat
command)
On Linux, recommend you execute it as a background process, and then monitor the output:
nohup python3 P4Transfer.py -c transfer.yaml -r > out1 &
This will run in the background, and poll for new changes (according to poll_interval
in the config file.)
You can look at the output file for progress, e.g. (avoiding long lines of text which can be output), or grep the log file:
tail -f out1 | cut -c -140
grep :INFO: log-P4Transfer-*.log
Note that if you edit the configuration file, it will be re-read the next time the script wakes up and polls for input. So you do not need to stop and restart the job.
The following simple setup will allow you to cross check easily source and target servers.
Assume we are in a directory: /some/path/p4transfer
export P4CONFIG=.p4config mkdir source target cd source vi .p4config
and create appropriate values as per your config file for the source server e.g.:
cat .p4config P4PORT=source-server:1666 P4USER=p4transfer P4CLIENT=p4transfer_client
And similarly create a file in the target
sub-directory.
This will allow you to quickly and easily cd
between directories and be able to run commands against respective
source and target p4d instances.
The comments in the file are mostly self-explanatory. It is important to specify the main values for the [source]
and [target]
sections.
P4Transfer.py --sample-config > transfer.yaml
cat transfer.yaml
The following included text may not display correctly when this .adoc file is viewed in GitHub - instead download the PDF version of this doc, or open transfer.yaml directly.
link:transfer.yaml[role=include]
In the [general]
section, you can customize the change_description_format
value to decide how transferred change descriptions are formatted.
Keywords in the format string are prefixed with $
. Use \n
for newlines. Keywords allowed are: $sourceDescription
, $sourceChange
, $sourcePort
, $sourceUser
.
Assume the source description is “Original change description”.
Default format:
$sourceDescription\n\nTransferred from p4://$sourcePort@$sourceChange
might produce:
Original change description
Transferred from p4://source-server:1667@2342
Custom format:
Originally $sourceChange by $sourceUser on $sourcePort\n$sourceDescription
might produce:
Originally 2342 by FBlogs on source-server:1667 Original change description
There is an option in the configuration file to specify a change_map_file. If you set this option (default is blank), then P4Transfer will append rows to the specified CSV file showing the relationship between source and target changelists, and will automatically check that file in after every process.
change_map_file = depot/import/change_map.csv
The result change map file might look something like this:
$ head change_map.csv
sourceP4Port,sourceChangeNo,targetChangeNo
src-server:1666,1231,12244
src-server:1666,1232,12245
src-server:1666,1233,12246
src_server:1666,1234,12247
src-server:1666,1235,12248
It is very straight forward to use standard tools such as grep to search this file. Because it is checked in to the target server, you can also use “p4 grep”.
Important
|
You will need to ensure that the change_map filename is properly mapped in the local workspace - thus it must
include <depot> and other pathname components in the path. When you have created your target workspace, run p4 client -o
to check the view mapping.
|
Note that since labeling itself is not versioned no labels or tags are transferred.
Branching and integrating with is implemented, as long as both source and target are within the workspace view. Otherwise, the integrate action is downgraded to an add or edit.
P4Transfer can be setup as a service on Windows using srvinst.exe
and srvanay.exe
to wrap the Python interpreter, or NSSM - The Non-Sucking Service Manager
Please contact [email protected]
for more details.
Any errors in the script are highly likely to be due to some unusual integration history, which may have been done with an older version of the Perforce server.
If you have an error when running the script, please use summarise_log.sh to create a summary log file to send. E.g.
summarise_log.sh log-P4Transfer-20141208094716.log > sum.log
If you get an error message in the log file such as:
P4TLogicException: Replication failure: missing elements in target changelist: /work/p4transfer/main/applications/util/Utils.java
or
P4TLogicException: Replication failure: src/target content differences found: rev = 1 action = branch type = text depotFile = //depot/main/applications/util/Utils.java
Then please also send the following:
A Revision Graph screen shot from the source server showing the specified file around the changelist which is being replicated. If an integration is involved then it is important to show the source of the integration.
Filelog output for the file in the source Perforce repository, and filelog output for the source of the integrate being performed. e.g.
p4 -ztag filelog /work/p4transfer/main/applications/util/Utils.java@12412 p4 -ztag filelog /work/p4transfer/dev/applications/util/Utils.java@12412
where 12412 is the changelist number being replicated when the problem occurred.
When an error has been fixed, you can usually re-start P4Transfer from where it left off. If the error occurred when validating changelist say 4253 on the target (which was say 12412 on the source) but found to be incorrect, the process is:
p4 -p target-p4:1666 -u transfer_user -c transfer_workspace obliterate //transfer_workspace/...@4253,4253
(re-run the above with the -y flag to actually perform the obliterate)
Ensure that the counter specified in your config file is set to a value less than 4253 such as the changelist immediately prior to that changelist. Then re-run P4Transfer as previously.
Pull Requests are welcome. Code changes should normally be accompanied by tests.
See TestP4Transfer.py for unit/integration tests.
Most tests generate a new p4d
repository with source/target servers and run test transfers.
The use the "rsh" hack to avoid having to spawn p4d on a port.
Pick your single test class (e.g. testAdd
):
python3 TestP4Transfer.py TestP4Transfer.testAdd
This will:
-
generate a single log file:
log-TestP4Transfer-*.log
-
create a test sub-directory
_testrun_transfer
with the following structure:source/ server/ # P4ROOT and other files for server - uses rsh hack for p4d client/ # Root of client used to checkin files .p4config # Defines P4PORT for source server target/ # Similar structure to source/ transfer_client/ # The root of shared transfer client
This test directory is created new for each test, and then left behind in case of test failures.
If you want to manually do tests or view results, then export P4CONFIG=.p4config
, and cd
into
the source/target directory to be able to run normal p4
commands as appropriate.