annotate Hg.txt @ 1:608947872f72
Add intro and some minor edits.
author | Jim Hague <jim.hague@acm.org> |
---|---|
date | Thu, 11 Dec 2008 11:17:15 +0000 |
parents | 48d338d29ce9 |
children | ee7f1e2c01a6 |
Inside a distributed version control system
===========================================

Grinton Lodge is a Youth Hostel that sits on an exposed hillside just
above the small hamlet of Grinton in Swaledale, in the Yorkshire Dales
National Park. A former Victorian shooting lodge, it now welcomes
walkers and other travellers from around the world.

Tonight, a Wednesday in mid-November, is not one of its busiest
nights. Kat, the duty staff member, tells me that there is a small
corporate team-building group in the annex. There's no sign of them at
present. Otherwise, that portion of the world that has beaten a path
to the door of this grand building today consists of just me. And Kat
goes home soon.

The November CVu, removed from its wrappers and read yesterday, lies
in my bag. Taunting me. Go on, it says, if you're ever going to put
finger to keyboard in the name of CVu, well, tonight you are out of
excuses.

Bugger.

Let's look into Mercurial
-------------------------

If you're at all interested in version control systems - and any
software developer not using one daily is a strange beast indeed -
you'll at least have become vaguely aware in the last few years of the
growing maturity of the latest group of version control systems
offering funky new stuff. These are the distributed version control
systems (DVCS). There is more to them than just their headline
attributes, being able to check history and do checkins while
disconnected from a central server, but these are damn useful to start
with.

When I first heard about DVCS, it wasn't immediately obvious to me (to
put it mildly) how they would work. After years of using a centralised
version control system, I had a rough mental model of what went on. But
how do you cope without the central server forcing ordering onto the
changes?

Since then I've started using Mercurial. Mercurial is a DVCS. It's one
of three DVCSs that have gained significant popularity in the last few
years, the other two being Git and Bazaar. I switched a significant
work project over to Mercurial (from Subversion) over a year ago,
because a customer site required on-site work but could not allow
access back to the company VPN. I chose Mercurial for a variety of
reasons which I won't bore you with here. If you must know, see the
box.

What I want to do in this article is give you an insight into how a
DVCS works. OK, so specifically I'm going to be talking about
Mercurial, but Git and Bazaar attack the problem in a similar way. But
first I'd better give you some idea of how you use Mercurial.

::::
Box: OK, if you must know:

o Implementability. I needed the system to work on Windows, Linux and
AIX. The last was not one of the directly supported platforms for
any of the candidates. Git's implementation uses a horde of
tools. Bazaar requires only Python, but required Python 2.4 while IBM
stubbornly still supplies only Python 2.3. Mercurial requires Python
2.3 or greater, and uses some C for speed.

o Simplicity. From the command line, Mercurial's core operations will
be familiar to a Subversion user. This is also true of Bazaar, but was
less true of Git. Git has improved in this matter since then, but a Mr
Winder of this parish tells me that it's still possible to seriously
embarrass yourself. There was also a lack of Windows support for Git at
the time.

o Speed. Mercurial is fast. In the same ballpark as Git. Bazaar
wasn't, and although it has improved significantly, has, in my
estimation, added user complexity in the process, and is still off the
pace for some operations.

o Documentation. At the time, Bryan O'Sullivan's excellent Mercurial
book (http://hgbook.red-bean.com) was a clear winner for best
documentation.
::::

The 5 minute Mercurial overview
-------------------------------

I think it unlikely that someone possessing the taste and discernment
to be reading CVu would not be familiar with at least one version
control system. So, while I want to give you a flavour of what it's
like to use, I'm not going to hang about. If you'd like a proper
introduction, or you don't follow something, I thoroughly recommend
you consult the Mercurial book.

To start using Mercurial to keep track of a project:

    $ hg init
    $

This creates the repository root in the current directory.

Like CVS with its CVS directory and Subversion with its .svn
directory, Mercurial keeps its private data in a directory. Mercifully
there is only one of these, in the top level of your project. And
rather than holding details of where the actual repository is to be
found, the .hg directory holds the entire repository.

Next you need to specify the files you want Mercurial to track.

    $ echo "There was a gibbon one morning" > pome.txt
    $ hg add pome.txt
    $

As you might expect, this marks the files as to be added. And as you
might also expect, you need to commit to record the added files in the
repository. The commit comment can be supplied on the command line; if
you don't supply a comment, you'll be dropped into an editor to
provide one.

There is a suggested format for these messages - a one line summary
followed by any more required detail on following lines. By default
Mercurial will only display the first line of commit messages when
listing changes. In these examples I'll stick to terse messages, and
I'll enter them from the command line.

    $ hg commit -m "My Pome" -u "Jim Hague <jim.hague@acm.org>"
    $

Mercurial records the user making the change as part of the change
information. It is usual to give your name and email address as I've
done here. You can imagine, though, that constantly having to repeat
this is a bit tedious, so you can set a default user name in a
configuration file. Mercurial keeps global, user and repository
configurations, and it can go in any of those.
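
For the curious, Mercurial's configuration files use an INI-style
layout, and the default user name is the 'username' key in the '[ui]'
section. The Python sketch below simply embeds such a fragment and
reads it back with the standard configparser module, to show the shape
of the setting. The variable names are mine, and Mercurial's own parser
is not configparser, so treat this as an illustration only.

```python
import configparser

# A minimal configuration fragment. On Unix the per-user file is
# usually ~/.hgrc; the per-repository one is .hg/hgrc.
HGRC_FRAGMENT = """\
[ui]
username = Jim Hague <jim.hague@acm.org>
"""

# interpolation=None so a '%' in a value would not be treated specially.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(HGRC_FRAGMENT)

default_user = parser["ui"]["username"]
print(default_user)
```

With that in place, 'hg commit' no longer needs the -u option.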

As with Subversion, after further edits you can see how your working
copy differs from the repository.

    $ hg status
    M pome.txt
    $ hg diff
    diff -r 33596ef855c1 pome.txt
    --- a/pome.txt Wed Apr 23 22:36:33 2008 +0100
    +++ b/pome.txt Wed Apr 23 22:48:01 2008 +0100
    @@ -1,1 +1,2 @@ There was a gibbon one morning
     There was a gibbon one morning
    +said "I think I will fly to the moon".
    $ hg commit -m "A great second line"
    $

And look through a log of changes.

    $ hg log
    changeset:   1:3d65e7a57890
    tag:         tip
    user:        Jim Hague <jim.hague@acm.org>
    date:        Wed Apr 23 22:49:10 2008 +0100
    summary:     A great second line

    changeset:   0:33596ef855c1
    user:        Jim Hague <jim.hague@acm.org>
    date:        Wed Apr 23 22:36:33 2008 +0100
    summary:     My Pome

    $

There are some items here that need an explanation.

The changeset identifier is in fact two identifiers separated by a
colon. The first is the sequence number of the changeset in the
repository, and is directly comparable to the change number in a
Subversion repository. The second is a globally unique identifier for
that change. As the change is copied from one repository to another
(this is a distributed system, remember, even if we haven't come to
that bit yet), its sequence number in any particular repository will
change, but the global identifier will always remain the same.

'tip' is a Mercurial term. It means simply the most recent change.

Want to rename a file?

    $ hg mv pome.txt poem.txt
    $ hg status
    A poem.txt
    R pome.txt
    $ hg commit -m "Rename my file"
    $

(The command to rename a file is actually 'hg rename', but Mercurial
saves Unix-trained fingers from typing embarrassment.)

At this point you may be wondering about directories. 'hg mkdir'
perhaps? Well, no. Mercurial only tracks files. To be sure, the
directory a file occupies is tracked, but effectively only as a
component of the file name. This has the slightly unexpected result
that you can't record an empty directory in your repository.

(Footnote: I tripped over this converting a work Subversion
repository. One possibility is to create a placeholder file in the
directory. In the event I created the directory (which receives build
products) as part of the build instead.)

Given this, and the status output above that suggests strongly that
Mercurial treats a rename as a copy followed by a delete, you may be
worried that Mercurial won't cope at all well with rearranging your
repository. Relax. Mercurial does store the details of the rename as
part of the changeset, and copes very well with rearrangements.

(Footnote: The Mercurial designers justify not dealing with
directories as first class objects by pointing out that provided you
can correctly move files about in the tree, the other reasons for
tracking directories are uncommon and do not in their opinion justify
the considerable added complexity. So far I've found no reason to
doubt that judgement.)

Want to rewind the working copy to a previous revision?

    $ hg update -r 1
    1 files updated, 0 files merged, 1 files removed, 0 files unresolved
    $

'hg update' updates the working files. In this case I'm specifying
that I want to go back to local changeset 1. I could also have typed
'-r 3d65e7a57890', or even '-r 3d'; when specifying the global change
identifier you only need to type enough digits to make it unique.
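
The 'enough digits to make it unique' rule is easy to picture as a
prefix search. Here is a minimal Python sketch of the idea, using the
two abbreviated identifiers from the log output above. The function
name and the error messages are my own invention, not Mercurial's, and
the sketch deals only with hash prefixes, not with local sequence
numbers.

```python
def resolve_prefix(prefix, known_ids):
    """Return the single id in known_ids that starts with prefix.

    Raises LookupError if nothing matches or the prefix is ambiguous.
    """
    matches = [node for node in known_ids if node.startswith(prefix)]
    if not matches:
        raise LookupError("unknown identifier '%s'" % prefix)
    if len(matches) > 1:
        raise LookupError("ambiguous identifier '%s'" % prefix)
    return matches[0]


# The (abbreviated) global identifiers of changesets 0 and 1.
KNOWN_IDS = ["33596ef855c1", "3d65e7a57890"]

print(resolve_prefix("3d", KNOWN_IDS))  # only one id starts with '3d'
try:
    resolve_prefix("3", KNOWN_IDS)      # both ids start with '3'
except LookupError as err:
    print(err)
```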

This is all very well, but it's not exactly distributed, is it?

Copy an existing repository:

    elsewhere$ hg clone ssh://jim.home.net/Poem Jim-Poem
    updating working directory
    1 files updated, 0 files merged, 0 files removed, 0 files unresolved

(You can access other repositories via the file system, over http or
over ssh.)

    elsewhere$ cd Jim-Poem
    elsewhere$ hg log
    changeset:   3:a065eb26e6b9
    tag:         tip
    user:        Jim Hague <jim.hague@acm.org>
    date:        Thu Apr 24 18:52:31 2008 +0100
    summary:     Rename my file

    changeset:   2:ff97668b7422
    user:        Jim Hague <jim.hague@acm.org>
    date:        Thu Apr 24 18:50:22 2008 +0100
    summary:     Finished first verse

    changeset:   1:3d65e7a57890
    user:        Jim Hague <jim.hague@acm.org>
    date:        Wed Apr 23 22:49:10 2008 +0100
    summary:     A great second line

    changeset:   0:33596ef855c1
    user:        Jim Hague <jim.hague@acm.org>
    date:        Wed Apr 23 22:36:33 2008 +0100
    summary:     My Pome

'hg clone' is aptly named. It creates a new repository that contains
exactly the same changes as the source repository. You can make a
clone just by copying your project directory, if you're confident
nothing else will access it during the copy. 'hg clone' saves you this
worry, and sets the default push/pull location in the new repo to the
cloned repo.

From that point, you use 'hg pull' to collect changes from other
places into your repo (though note it does not by default update your
working copy), and, as you might guess, 'hg push' shoves your changes
into a foreign repository. By default these will act on the repository
you cloned from, but you can specify any other repository.

More on those in a moment. First, though, I want to show you something
you can't do in Subversion. Start with the repository with 4 changes
we just cloned. I want to focus on the first couple of lines, so I'll
wind the working copy back to the point where only those lines exist.

    $ hg update -r 1
    1 files updated, 0 files merged, 1 files removed, 0 files unresolved

And make a change.

    $ hg diff
    diff -r 3d65e7a57890 pome.txt
    --- a/pome.txt Wed Apr 23 22:49:10 2008 +0100
    +++ b/pome.txt Thu Apr 24 19:13:14 2008 +0100
    @@ -1,2 +1,2 @@ There was a gibbon one morning
    -There was a gibbon one morning
    -said "I think I will fly to the moon".
    +There was a baboon who one afternoon
    +said "I think I will fly to the sun".
    $ hg commit -m "Better first two lines"
    $

The alert among you will have sat up at that. Well done! Yes, there's
something very worrying. How can I commit a change at an old point?
If you try this in Subversion, it will complain mightily about your
file being out of date. But Mercurial just went ahead and did
something. The Bazaar experts among you will know that in Bazaar, if
you use 'bzr revert -r' to bring the working copy to a past revision,
make a change and commit, then your latest version will be the past
revision plus your change. Perhaps that's what Mercurial did?

No. What Mercurial did is central to Mercurial's view of the
world. You took your working copy back to an old changeset, and then
committed a fresh change based at that changeset. Mercurial actually
did just what you asked it to do, no more and no less. Let's see the
initial evidence.

    $ hg heads
    changeset:   4:267d32f158b3
    tag:         tip
    parent:      1:3d65e7a57890
    user:        Jim Hague <jim.hague@acm.org>
    date:        Thu Apr 24 19:13:59 2008 +0100
    summary:     Better first two lines

    changeset:   3:a065eb26e6b9
    user:        Jim Hague <jim.hague@acm.org>
    date:        Thu Apr 24 18:52:31 2008 +0100
    summary:     Rename my file

    $

Time for some more Mercurial terminology. You can think of a 'head' in
Mercurial as the most recent change on a branch. In Mercurial, a
branch is simply what happens when you commit a change that has as its
parent a change that already has a child. Mercurial has a standard
extension 'hg glog' which uses some ASCII art to show the current
state:

    $ hg glog
    @  changeset:   4:267d32f158b3
    |  tag:         tip
    |  parent:      1:3d65e7a57890
    |  user:        Jim Hague <jim.hague@acm.org>
    |  date:        Thu Apr 24 19:13:59 2008 +0100
    |  summary:     Better first two lines
    |
    | o  changeset:   3:a065eb26e6b9
    | |  user:        Jim Hague <jim.hague@acm.org>
    | |  date:        Thu Apr 24 18:52:31 2008 +0100
    | |  summary:     Rename my file
    | |
    | o  changeset:   2:ff97668b7422
    |/   user:        Jim Hague <jim.hague@acm.org>
    |    date:        Thu Apr 24 18:50:22 2008 +0100
    |    summary:     Finished first verse
    |
    o  changeset:   1:3d65e7a57890
    |  user:        Jim Hague <jim.hague@acm.org>
    |  date:        Wed Apr 23 22:49:10 2008 +0100
    |  summary:     A great second line
    |
    o  changeset:   0:33596ef855c1
       user:        Jim Hague <jim.hague@acm.org>
       date:        Wed Apr 23 22:36:33 2008 +0100
       summary:     My Pome

    $

'hg view' shows a nicer graphical view. (Footnote: Though, being
Tcl/Tk based, not that much nicer.)

So the change is in there. It's the latest change, and is simply on a
different branch to the other changes.

Almost invariably, you will want to bring everything back together and
merge the branches. A merge is a change that combines two heads back
into one. It prepares an updated working directory with the merged
contents of the two heads for you to review and, if satisfactory, commit.

    $ hg merge
    merging pome.txt and poem.txt
    0 files updated, 1 files merged, 0 files removed, 0 files unresolved
    (branch merge, don't forget to commit)
    $ cat poem.txt
    There was a baboon who one afternoon
    said "I think I will fly to the sun".
    So with two great palms strapped to his arms,
    he started his takeoff run.
    $ hg commit -m "Merge first line branch"
    $

(Footnote: I'm no poet. The poem is, of course, 'Silly Old Baboon' by
the late, great, Spike Milligan. From 'A Book of Milliganimals',
Puffin, 1971.)

Here's the ASCII art again showing what just happened. Oh, and notice
that Mercurial has done the right thing with regard to the rename.

    $ hg glog
    @    changeset:   5:792ab970fc80
    |\   tag:         tip
    | |  parent:      4:267d32f158b3
    | |  parent:      3:a065eb26e6b9
    | |  user:        Jim Hague <jim.hague@acm.org>
    | |  date:        Thu Apr 24 19:29:53 2008 +0100
    | |  summary:     Merge first line branch
    | |
    | o  changeset:   4:267d32f158b3
    | |  parent:      1:3d65e7a57890
    | |  user:        Jim Hague <jim.hague@acm.org>
    | |  date:        Thu Apr 24 19:13:59 2008 +0100
    | |  summary:     Better first two lines
    | |
    o |  changeset:   3:a065eb26e6b9
    | |  user:        Jim Hague <jim.hague@acm.org>
    | |  date:        Thu Apr 24 18:52:31 2008 +0100
    | |  summary:     Rename my file
    | |
    o |  changeset:   2:ff97668b7422
    |/   user:        Jim Hague <jim.hague@acm.org>
    |    date:        Thu Apr 24 18:50:22 2008 +0100
    |    summary:     Finished first verse
    |
    o  changeset:   1:3d65e7a57890
    |  user:        Jim Hague <jim.hague@acm.org>
    |  date:        Wed Apr 23 22:49:10 2008 +0100
    |  summary:     A great second line
    |
    o  changeset:   0:33596ef855c1
       user:        Jim Hague <jim.hague@acm.org>
       date:        Wed Apr 23 22:36:33 2008 +0100
       summary:     My Pome

    $

So, our little branch change has now been merged back, and we have a
single line of development again. Notice that unlike the other
changesets, changeset 5 has two parent changesets, indicating it is a
merge changeset. You can only merge two branches in one operation; or
putting it another way, a changeset can have a maximum of two parents.
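
If it helps to see heads as a property of the graph rather than as
something Mercurial keeps a list of, here is a small Python sketch. It
treats the history as a map from each local revision number to its
parents, transcribed from the 'hg glog' output in this article, and
calls a changeset a head if nothing names it as a parent. The function
and variable names are mine, and named branches are ignored.

```python
def heads(parents):
    """Return the changesets that no other changeset names as a parent."""
    has_child = {p for ps in parents.values() for p in ps}
    return sorted(rev for rev in parents if rev not in has_child)


# Local revision number -> tuple of parent revision numbers,
# as the history stood before the merge.
before_merge = {0: (), 1: (0,), 2: (1,), 3: (2,), 4: (1,)}

# Changeset 5 is the merge: it has two parents, 4 and 3.
after_merge = dict(before_merge)
after_merge[5] = (4, 3)

print(heads(before_merge))  # [3, 4]
print(heads(after_merge))   # [5]
```

Committing changeset 4 on top of changeset 1 gave 1 a second child, so
the graph gained a head; the merge took both heads as parents and left
just one.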

This behaviour is absolutely central to Mercurial's philosophy. If a
change is committed that takes as its starting point a change that
already has a child, then a branch gets created. Working with
Mercurial, branches get created frequently, and equally frequently
merged back. As befits any frequent operation, both are easy to do.

You're probably thinking at this point that making a commit onto
an old version is a slightly strange thing to do, and you'd be right.
But that's exactly what's going to happen the moment you go
distributed. Two people working independently with their own
repositories are going to make commits based, typically, on the latest
changes they happen to have incorporated into their tree. To be
Distributed, a DVCS has to deal with this. Mercurial faces it head-on.
When you pull changes into your repo (or someone else pushes them), if
any of the changes overlap - are both based on the same base change -
you get extra heads, and it's up to you to let these extra heads live
or merge, as you please.

In practice this is more manageable than you might think. Consider a
typical Mercurial usage, where the 'master' repo sits on a known
server, and everyone pulls changes from the master and pushes their
own efforts to the master. By default Mercurial won't let you push if
the receiving repo will gain an extra head as a result, so you
typically pull (and do any required merging) just before
pushing. Subversion users will recognise this pattern. Subversion
won't let you commit a change if your working copy is not at the very
latest revision, so the Subversion user will update, and merge if
necessary, just before committing.

What, then, about a branch in the conventional sense of '1.0
maintenance branch'? Typically in Mercurial you'd handle this by
keeping a separate cloned repository for those changes. Cloning is
fast and, if local, uses hard links where possible on filesystems that
support them, so isn't necessarily extravagant on disc space. You can,
if you prefer, handle them all in a single repo with 'named
branches', but cloning is definitely simpler.

OK, so now you know the basics of using Mercurial. We can proceed to
looking at how this magic is achieved. In particular, where does this
magic globally unique identifier for a change come from?

Inside the Mercurial repo
-------------------------

The way Mercurial handles its repo is really quite simple.

That's simple, as in 'most things are simple once you know the
answer'. I found the explanation helpful, so this section attempts
the 10,000ft (FL100 if you prefer) view of Mercurial.

(Footnote: Bryan O'Sullivan's excellent Mercurial book has a chapter
on the subject, and the Mercurial website has a fair amount of detail
too. This is 'research', OK?)

First remember that any file or component can only have one or two
parents. You can't merge more than one other branch at once.

We start with the basic building block, which Mercurial calls a
revlog. A revlog is a thing that holds a file and all the changes in
the file history. (Footnote: For any non-trivial file, this will
actually be two files on the disc, a data file and an index.) The
revlog stores the (compressed) differences between successive versions
of the file, though it will periodically store a complete version of
the file instead of a difference, so that the content of any
particular file version can always be reconstructed without excessive
effort.
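
To make the 'differences plus the occasional complete version' idea
concrete, here is a toy revlog in Python. Everything about it is an
assumption made for brevity: the class and function names are mine,
the delta is a crude 'keep a prefix and a suffix, replace the middle',
nothing is compressed, and a complete version is stored at a fixed
interval, whereas Mercurial applies its own rules for when to do so.

```python
def make_delta(old, new):
    """Describe new as: keep a prefix and suffix of old, replace the middle."""
    limit = min(len(old), len(new))
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1
    suffix = 0
    while (suffix < limit - prefix
           and old[len(old) - 1 - suffix] == new[len(new) - 1 - suffix]):
        suffix += 1
    return prefix, suffix, new[prefix:len(new) - suffix]


def apply_delta(old, delta):
    prefix, suffix, middle = delta
    return old[:prefix] + middle + old[len(old) - suffix:]


class ToyRevlog:
    """Store each revision as a delta, with a full copy every few revisions."""

    def __init__(self, snapshot_every=3):  # fixed interval: an assumption
        self.snapshot_every = snapshot_every
        self.entries = []  # ("full", text) or ("delta", delta)

    def add(self, text):
        rev = len(self.entries)
        if rev % self.snapshot_every == 0:
            self.entries.append(("full", text))
        else:
            previous = self.get(rev - 1)
            self.entries.append(("delta", make_delta(previous, text)))
        return rev

    def get(self, rev):
        base = rev
        while self.entries[base][0] != "full":  # walk back to a full copy
            base -= 1
        text = self.entries[base][1]
        for r in range(base + 1, rev + 1):      # replay deltas forwards
            text = apply_delta(text, self.entries[r][1])
        return text


V0 = "There was a gibbon one morning\n"
V1 = V0 + 'said "I think I will fly to the moon".\n'
V2 = V1 + "So with two great palms strapped to his arms,\nhe started his takeoff run.\n"
V3 = 'There was a baboon who one afternoon\nsaid "I think I will fly to the sun".\n'
VERSIONS = [V0, V1, V2, V3]

log = ToyRevlog()
for version in VERSIONS:
    log.add(version)
print([kind for kind, _ in log.entries])
```

Reconstructing a revision never has to replay more than a bounded run
of deltas, which is the 'without excessive effort' part.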

Under the secret-squirrel Mercurial .hg directory at the top of your
project is a store which holds a revlog for each file in your project.

Any point in the evolution of a revlog can be uniquely identified with
a nodeid. This is simply the SHA1 hash of the current file contents
concatenated with the nodeids of one or both parents of the current
revision. Note that this way, two file states are identical if and
only if the file contents are the same *and* the file has the
same history.
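
As a sketch of that rule, the Python below hashes the parent nodeids
together with the file contents using the standard hashlib module. The
exact byte layout is an assumption on my part: as far as I understand
it, Mercurial feeds the two parent nodeids to the hash first, in
sorted order, followed by the contents, with an all-zero nodeid
standing in for a missing parent. I have not checked the result
against the nodeids shown in this article.

```python
import hashlib

NULL_ID = b"\0" * 20  # all-zero nodeid meaning "no parent"


def nodeid(text, p1=NULL_ID, p2=NULL_ID):
    """Hash the parent nodeids (in sorted order) followed by the contents."""
    first, second = sorted([p1, p2])
    digest = hashlib.sha1(first)
    digest.update(second)
    digest.update(text)
    return digest.digest()


line1 = b"There was a gibbon one morning\n"
rev0 = nodeid(line1)
rev1 = nodeid(line1 + b'said "I think I will fly to the moon".\n', p1=rev0)

# Same contents as rev0, but arrived at by a different history.
same_text_other_history = nodeid(line1, p1=rev1)

print(rev0.hex())
print(same_text_other_history.hex())
```

The last two values differ even though the contents are identical,
which is the 'same contents *and* same history' property at work.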

Here's a dump of a revlog index:

    $ hg debugindex .hg/store/data/pome.txt.i
       rev    offset  length   base linkrev nodeid       p1           p2
         0         0      32      0       0 6bbbd5d6cc53 000000000000 000000000000
         1        32      51      0       1 83d266583303 6bbbd5d6cc53 000000000000
         2        83      84      0       2 14a54ec34bb6 83d266583303 000000000000
         3       167      76      3       4 dc4df776b38b 83d266583303 000000000000
    $

Note here that a file state can have two parents. If both the parent
nodeids are non-null, the file state has two parents, and the state is
therefore the result of a merge.

Let's dump out a revlog at a particular revision:

    $ hg debugdata .hg/store/data/pome.txt.i 2
    There was a gibbon one morning
    said "I think I will fly to the moon".
    So with two great palms strapped to his arms,
    he started his takeoff run.
    $

The next component is the manifest. This is simply a list of all the
files in the project, together with their current nodeids. The
manifest is a file, held in a revlog. The nodeid of the manifest,
therefore, identifies the project filesystem at a particular point.

    $ hg debugdata .hg/store/00manifest.i 5
    poem.txt5168b1a5e2f44aa4e0f164e170820845183f50c8
    $

Finally we have the changeset. This is the atomic collection of
changes to a repository that leads to a new revision. The changeset
info includes the nodeid of the corresponding manifest, the timestamp
and committer ID, a list of changed files and a comment. The changeset
also includes the nodeid of the parent changeset, or the two parents
if the change is a merge. The changeset description is held in a
revlog, the changelog.

    $ hg debugdata .hg/store/00changelog.i 5
    1ccc11b6f7308cc8fa1573c2f3811a4710c91e3e
    Jim Hague <jim.hague@acm.org>
    1209061793 -3600
    poem.txt
    pome.txt

    Merge first line branch
    $
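
The changelog entry is plain enough to pick apart by hand. The Python
sketch below does so for the entry dumped above: manifest nodeid,
committer, a date line, the changed files, a blank line, then the
comment. The function name and dictionary keys are mine. The one
subtlety is the date line, which holds seconds since the epoch and an
offset in seconds west of UTC, so -3600 is the +0100 shown by 'hg log'.

```python
from datetime import datetime, timedelta, timezone

ENTRY = """\
1ccc11b6f7308cc8fa1573c2f3811a4710c91e3e
Jim Hague <jim.hague@acm.org>
1209061793 -3600
poem.txt
pome.txt

Merge first line branch
"""


def parse_changelog_entry(raw):
    """Split a changelog entry, laid out as dumped above, into fields."""
    header, _, description = raw.partition("\n\n")
    manifest, user, date_line, *files = header.split("\n")
    seconds, offset = date_line.split()[:2]
    # The offset counts seconds west of UTC, hence the sign flip.
    zone = timezone(timedelta(seconds=-int(offset)))
    return {
        "manifest": manifest,
        "user": user,
        "date": datetime.fromtimestamp(int(seconds), tz=zone),
        "files": files,
        "description": description.strip(),
    }


entry = parse_changelog_entry(ENTRY)
print(entry["date"].isoformat())  # 2008-04-24T19:29:53+01:00
```

That date matches the one 'hg glog' reported for changeset 5.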

The nodeid of the changeset, therefore, gives us a globally unique
identifier for any particular change. Changesets have a
Subversion-like incrementing change number, but it is peculiar to that
repository. The nodeid, however, is global.

One more detail remains to complete the picture. How do we get back
from a particular file change to find the responsible changeset? Each
revlog change has a linkrev entry that does just this.

So, now we have a repository with a history of the changes applied to
that repository. Each change has a unique identifier. If we find that
change in another repository, it means that at that point in the other
repository we have exactly the same state; the file contents and
history are identical.

At this point we can see how pulling changes from another repository
works. Mercurial has to determine which changesets in the source
repository are missing in the target repository. To do this, for each
head in the source repo it has to find the most recent change in that
head that is already present in the target repo, and get any remaining
changes after that point. These changes are then copied over and
applied.
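
Here is that search as a Python sketch, using the abbreviated
changeset identifiers from this article. It assumes both histories are
sitting in memory as simple maps, which is emphatically not how
Mercurial does it over a network, and the function name is mine. The
walk starts at each source head and stops on any changeset the target
already has, since the target must then have all of its ancestors too.

```python
def missing_changesets(source_parents, source_heads, target_ids):
    """Return source changesets the target lacks, parents before children."""
    ordered = []
    seen = set(target_ids)

    def visit(node):
        if node in seen:
            return  # the target has it, and therefore all its ancestors
        seen.add(node)
        for parent in source_parents[node]:
            visit(parent)
        ordered.append(node)  # appended only after its parents

    for head in source_heads:
        visit(head)
    return ordered


# Changeset id -> parent ids, for the repository after the merge.
SOURCE = {
    "33596ef855c1": (),
    "3d65e7a57890": ("33596ef855c1",),
    "ff97668b7422": ("3d65e7a57890",),
    "a065eb26e6b9": ("ff97668b7422",),
    "267d32f158b3": ("3d65e7a57890",),
    "792ab970fc80": ("267d32f158b3", "a065eb26e6b9"),
}

# The repository we originally cloned from only has the first four.
TARGET = {"33596ef855c1", "3d65e7a57890", "ff97668b7422", "a065eb26e6b9"}

to_copy = missing_changesets(SOURCE, ["792ab970fc80"], TARGET)
print(to_copy)  # ['267d32f158b3', '792ab970fc80']
```

Only the new branch change and the merge need to travel; everything
else is recognised by its identifier and left alone.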

The Mercurial revlog format has proved remarkably durable. Over the
lifetime of Mercurial, there have been just two changes to the file
format. And one of those (a very recent change at the time of
writing, yet to appear in a release version) is a very small change to
filename storage required to deal with Windows-specific issues.