annotate Hg.txt @ 4:561edf852797
More detail on file format changes, from post by MM to mercurial.general.
author | Jim Hague <jim.hague@icc-atcsolutions.com> |
---|---|
date | Sun, 21 Dec 2008 21:42:58 +0000 |
parents | 175493e0e457 |
children | 2ec53c0ed5d8 |
Inside a distributed version control system
===========================================

Grinton Lodge is a Youth Hostel that sits on an exposed hillside just
above the small hamlet of Grinton in Swaledale, in the Yorkshire Dales
National Park. A former Victorian shooting lodge, it now welcomes
walkers and other travellers from around the world.

Tonight, a Wednesday in mid-November, is not one of its busiest
nights. Kat, the duty staff member, tells me that there is a small
corporate team-building group in the annex. There's no sign of them at
present. Otherwise, that portion of the world that has beaten a path
to the door of this grand building today consists of just me. And Kat
goes home soon.

The November CVu, removed from its wrappers and read yesterday, lies
in my bag. Taunting me. Go on, it says, if you're ever going to put
finger to keyboard in the name of CVu, well, tonight you are out of
excuses.

Bugger.

Let's look into Mercurial
-------------------------

If you're at all interested in version control systems - and any
software developer not using one daily is a strange beast indeed -
you'll at least have become vaguely aware in the last few years of the
growing maturity of the latest group of version control systems
offering funky new stuff. These are the distributed version control
systems (DVCS). There is more to them than just their headline
attributes, being able to check history and do checkins while
disconnected from a central server, but these are damn useful to start
with.

When I first heard about DVCS, it wasn't immediately obvious to me (to
put it mildly) how they would work. After years of using a centralised
version control system, I had a rough mental model of what went on. But
how do you cope without the central server forcing ordering onto the
changes?

Since then I've started using Mercurial. Mercurial is a DVCS. It's one
of three DVCSs that have gained significant popularity in the last few
years, the other two being Git and Bazaar. I switched a significant
work project over to Mercurial (from Subversion) over a year ago,
because a customer site required on-site work but could not allow
access back to the company VPN. I chose Mercurial for a variety of
reasons which I won't bore you with here. If you must know, see the
box.

What I want to do in this article is give you an insight into how a
DVCS works. OK, so specifically I'm going to be talking about
Mercurial, but Git and Bazaar attack the problem in a similar way. But
first I'd better give you some idea of how you use Mercurial.

::::
Box: OK, if you must know:

o Implementability. I needed the system to work on Windows, Linux and
AIX. The latter was not one of the directly supported platforms for
any of the candidates. Git's implementation uses a horde of
tools. Bazaar requires only Python, but required Python 2.4 while IBM
stubbornly still supplies only Python 2.3. Mercurial requires Python
2.3 or greater, and uses some C for speed.

o Simplicity. My users used Subversion daily, but did not generally
have much experience with other VCS. From the command line,
Mercurial's core operations will be familiar to a Subversion
user. This is also true of Bazaar, but was less true of Git. Git has
improved in this matter since then, but a Mr Winder of this parish
tells me that it's still possible to seriously embarrass
yourself. There was also a lack of Windows support for Git at the
time.

o Speed. Mercurial is fast. In the same ballpark as Git. Bazaar
wasn't, and although it has improved significantly, has, in my
estimation, added user complexity in the process, and is still off the
pace for some operations.

o Documentation. At the time, Bryan O'Sullivan's excellent Mercurial
book (http://hgbook.red-bean.com) was a clear winner for best
documentation.
::::

The 5 minute Mercurial overview
-------------------------------

I think it unlikely that someone possessing the taste and discernment
to be reading CVu would not be familiar with at least one version
control system. So, while I want to give you a flavour of what it's
like to use, I'm not going to hang about. If you'd like a proper
introduction, or you don't follow something, I thoroughly recommend
you consult the Mercurial book.

To start using Mercurial to keep track of a project:

    $ hg init
    $

This creates the repository root in the current directory.

Like CVS with its CVS directory and Subversion with its .svn
directory, Mercurial keeps its private data in a directory. Mercifully
there is only one of these, in the top level of your project. And
rather than holding details of where the actual repository is to be
found, the .hg directory holds the entire repository.

Next you need to specify the files you want Mercurial to track.

    $ echo "There was a gibbon one morning" > pome.txt
    $ hg add pome.txt
    $

As you might expect, this marks the files as to be added. And as you
might also expect, you need to commit to record the added files in the
repository. The commit comment can be supplied on the command line; if
you don't supply a comment, you'll be dropped into an editor to
provide one.

There is a suggested format for these messages - a one line summary
followed by any more required detail on following lines. By default
Mercurial will only display the first line of commit messages when
listing changes. In these examples I'll stick to terse messages, and
I'll enter them from the command line.

    $ hg commit -m "My Pome" -u "Jim Hague <jim.hague@acm.org>"
    $

Mercurial records the user making the change as part of the change
information. It is usual to give your name and email address as I've
done here. You can imagine, though, that constantly having to repeat
this is a bit tedious, so you can set a default user name in a
configuration file. Mercurial keeps global, user and repository
configurations, and it can go in any of those.

As with Subversion, after further edits you see how your working copy
differs from the repository.

    $ hg status
    M pome.txt
    $ hg diff
    diff -r 33596ef855c1 pome.txt
    --- a/pome.txt Wed Apr 23 22:36:33 2008 +0100
    +++ b/pome.txt Wed Apr 23 22:48:01 2008 +0100
    @@ -1,1 +1,2 @@ There was a gibbon one morning
     There was a gibbon one morning
    +said "I think I will fly to the moon".
    $ hg commit -m "A great second line"
    $

And look through a log of changes.

    $ hg log
    changeset:   1:3d65e7a57890
    tag:         tip
    user:        Jim Hague <jim.hague@acm.org>
    date:        Wed Apr 23 22:49:10 2008 +0100
    summary:     A great second line

    changeset:   0:33596ef855c1
    user:        Jim Hague <jim.hague@acm.org>
    date:        Wed Apr 23 22:36:33 2008 +0100
    summary:     My Pome

    $

There are some items here that need an explanation.

The changeset identifier is in fact two identifiers separated by a
colon. The first is the sequence number of the changeset in the
repository, and is directly comparable to the change number in a
Subversion repository. The second is a globally unique identifier for
that change. As the change is copied from one repository to another
(this is a distributed system, remember, even if we haven't come to
that bit yet), its sequence number in any particular repository will
change, but the global identifier will always remain the same.

'tip' is a Mercurial term. It means simply the most recent change.

Want to rename a file?

    $ hg mv pome.txt poem.txt
    $ hg status
    A poem.txt
    R pome.txt
    $ hg commit -m "Rename my file"
    $

(The command to rename a file is actually 'hg rename', but Mercurial
saves Unix-trained fingers from typing embarrassment.)

At this point you may be wondering about directories. 'hg mkdir'
perhaps? Well, no. Mercurial only tracks files. To be sure, the
directory a file occupies is tracked, but effectively only as a
component of the file name. This has the slightly unexpected result
that you can't record an empty directory in your repository.

(Footnote: I tripped over this converting a work Subversion
repository. One possibility is to create a placeholder file in the
directory. In the event I created the directory (which receives build
products) as part of the build instead.)

Given this, and the status output above that suggests strongly that
Mercurial treats a rename as a copy followed by a delete, you may be
worried that Mercurial won't cope at all well with rearranging your
repository. Relax. Mercurial does store the details of the rename as
part of the changeset, and copes very well with rearrangements.

(Footnote: The Mercurial designers justify not dealing with
directories as first class objects by pointing out that provided you
can correctly move files about in the tree, the other reasons for
tracking directories are uncommon and do not in their opinion justify
the considerable added complexity. So far I've found no reason to
doubt that judgement.)

Want to rewind the working copy to a previous revision?

    $ hg update -r 1
    1 files updated, 0 files merged, 1 files removed, 0 files unresolved
    $

'hg update' updates the working files. In this case I'm specifying
that I want to go back to local changeset 1. I could also have typed
'-r 3d65e7a57890', or even '-r 3d'; when specifying the global change
identifier you only need to type enough digits to make it unique.

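The "enough digits to make it unique" rule is easy to picture as a
prefix search. What follows is only a sketch in Python, not
Mercurial's own lookup code: the function name is made up for the
illustration, it looks only at hexadecimal identifiers, and it ignores
the local sequence numbers that the real command also accepts.

```python
def resolve_prefix(prefix, node_ids):
    """Return the single full identifier that starts with `prefix`."""
    matches = [node for node in node_ids if node.startswith(prefix.lower())]
    if not matches:
        raise LookupError(f"unknown revision '{prefix}'")
    if len(matches) > 1:
        raise LookupError(f"ambiguous identifier '{prefix}'")
    return matches[0]


# The two short identifiers from the log output shown earlier.
KNOWN = ["33596ef855c1", "3d65e7a57890"]

print(resolve_prefix("3d", KNOWN))    # 3d65e7a57890
```

With these two identifiers a bare '3' would not be enough, since both
begin with it, and the sketch raises an error rather than guess.
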
This is all very well, but it's not exactly distributed, is it?

Copy an existing repository:

    elsewhere$ hg clone ssh://jim.home.net/Poem Jim-Poem
    updating working directory
    1 files updated, 0 files merged, 0 files removed, 0 files unresolved

(You can access other repositories via the file system, over http or
over ssh).

    elsewhere$ cd Jim-Poem
    elsewhere$ hg log
    changeset:   3:a065eb26e6b9
    tag:         tip
    user:        Jim Hague <jim.hague@acm.org>
    date:        Thu Apr 24 18:52:31 2008 +0100
    summary:     Rename my file

    changeset:   2:ff97668b7422
    user:        Jim Hague <jim.hague@acm.org>
    date:        Thu Apr 24 18:50:22 2008 +0100
    summary:     Finished first verse

    changeset:   1:3d65e7a57890
    user:        Jim Hague <jim.hague@acm.org>
    date:        Wed Apr 23 22:49:10 2008 +0100
    summary:     A great second line

    changeset:   0:33596ef855c1
    user:        Jim Hague <jim.hague@acm.org>
    date:        Wed Apr 23 22:36:33 2008 +0100
    summary:     My Pome

'hg clone' is aptly named. It creates a new repository that contains
exactly the same changes as the source repository. You can make a
clone just by copying your project directory, if you're confident
nothing else will access it during the copy. 'hg clone' saves you this
worry, and sets the default push/pull location in the new repo to the
cloned repo.

From that point, you use 'hg pull' to collect changes from other
places into your repo (though note it does not by default update your
working copy), and, as you might guess, 'hg push' shoves your changes
into a foreign repository. By default these will act on the repository
you cloned from, but you can specify any other repository.

More on those in a moment. First, though, I want to show you something
you can't do in Subversion. Start with the repository with 4 changes
we just cloned. I want to focus on the first couple of lines, so I'll
wind the working copy back to the point where only those lines exist.

    $ hg update -r 1
    1 files updated, 0 files merged, 1 files removed, 0 files unresolved

And make a change.

    $ hg diff
    diff -r 3d65e7a57890 pome.txt
    --- a/pome.txt Wed Apr 23 22:49:10 2008 +0100
    +++ b/pome.txt Thu Apr 24 19:13:14 2008 +0100
    @@ -1,2 +1,2 @@ There was a gibbon one morning
    -There was a gibbon one morning
    -said "I think I will fly to the moon".
    +There was a baboon who one afternoon
    +said "I think I will fly to the sun".
    $ hg commit -m "Better first two lines"
    $

The alert among you will have sat up at that. Well done! Yes, there's
something very worrying. How can I commit a change at an old point?
If you try this in Subversion, it will complain mightily about your
file being out of date. But Mercurial just went ahead and did
something. The Bazaar experts among you will know that in Bazaar, if
you use 'bzr revert -r' to bring the working copy to a past revision,
make a change and commit, then your latest version will be the past
revision plus your change. Perhaps that's what Mercurial did?

No. What Mercurial did is central to Mercurial's view of the
world. You took your working copy back to an old changeset, and then
committed a fresh change based at that changeset. Mercurial actually
did just what you asked it to do, no more and no less. Let's see the
initial evidence.

    $ hg heads
    changeset:   4:267d32f158b3
    tag:         tip
    parent:      1:3d65e7a57890
    user:        Jim Hague <jim.hague@acm.org>
    date:        Thu Apr 24 19:13:59 2008 +0100
    summary:     Better first two lines

    changeset:   3:a065eb26e6b9
    user:        Jim Hague <jim.hague@acm.org>
    date:        Thu Apr 24 18:52:31 2008 +0100
    summary:     Rename my file

    $

Time for some more Mercurial terminology. You can think of a 'head' in
Mercurial as the most recent change on a branch. In Mercurial, a
branch is simply what happens when you commit a change that has as its
parent a change that already has a child. Mercurial has a standard
extension 'hg glog' which uses some ASCII art to show the current
state:

    $ hg glog
    @  changeset:   4:267d32f158b3
    |  tag:         tip
    |  parent:      1:3d65e7a57890
    |  user:        Jim Hague <jim.hague@acm.org>
    |  date:        Thu Apr 24 19:13:59 2008 +0100
    |  summary:     Better first two lines
    |
    | o  changeset:   3:a065eb26e6b9
    | |  user:        Jim Hague <jim.hague@acm.org>
    | |  date:        Thu Apr 24 18:52:31 2008 +0100
    | |  summary:     Rename my file
    | |
    | o  changeset:   2:ff97668b7422
    |/   user:        Jim Hague <jim.hague@acm.org>
    |    date:        Thu Apr 24 18:50:22 2008 +0100
    |    summary:     Finished first verse
    |
    o  changeset:   1:3d65e7a57890
    |  user:        Jim Hague <jim.hague@acm.org>
    |  date:        Wed Apr 23 22:49:10 2008 +0100
    |  summary:     A great second line
    |
    o  changeset:   0:33596ef855c1
       user:        Jim Hague <jim.hague@acm.org>
       date:        Wed Apr 23 22:36:33 2008 +0100
       summary:     My Pome

    $

'hg view' shows a nicer graphical view. (Footnote: Though, being
Tcl/Tk based, not that much nicer.)

So the change is in there. It's the latest change, and is simply on a
different branch to the other changes.

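Put another way, a head is any changeset that no other changeset names
as a parent. A small Python sketch makes the point; the dictionary is
simply my transcription of the graph above using local revision
numbers, and has nothing to do with how Mercurial stores things.

```python
def heads(parents):
    """Return the changesets that no other changeset names as a parent."""
    has_child = {p for ps in parents.values() for p in ps}
    return sorted(rev for rev in parents if rev not in has_child)


# Local revision numbers and their parents, as drawn by 'hg glog'.
HISTORY = {0: [], 1: [0], 2: [1], 3: [2], 4: [1]}

print(heads(HISTORY))    # [3, 4]
```

Changeset 1 is the parent of both 2 and 4, which is exactly the "parent
that already has a child" situation that creates a branch.
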
Almost invariably, you will want to bring everything back together and
merge the branches. A merge is a change that combines two heads back
into one. It prepares an updated working directory with the merged
contents of the two heads for you to review and, if satisfactory, commit.

    $ hg merge
    merging pome.txt and poem.txt
    0 files updated, 1 files merged, 0 files removed, 0 files unresolved
    (branch merge, don't forget to commit)
    $ cat poem.txt
    There was a baboon who one afternoon
    said "I think I will fly to the sun".
    So with two great palms strapped to his arms,
    he started his takeoff run.
    $ hg commit -m "Merge first line branch"
    $

(Footnote: I'm no poet. The poem is, of course, 'Silly Old Baboon' by
the late, great, Spike Milligan. From 'A Book of Milliganimals',
Puffin, 1971.)

Here's the ASCII art again showing what just happened. Oh, and notice
in the above that Mercurial has done the right thing with regard to
the rename.

    $ hg glog
    @    changeset:   5:792ab970fc80
    |\   tag:         tip
    | |  parent:      4:267d32f158b3
    | |  parent:      3:a065eb26e6b9
    | |  user:        Jim Hague <jim.hague@acm.org>
    | |  date:        Thu Apr 24 19:29:53 2008 +0100
    | |  summary:     Merge first line branch
    | |
    | o  changeset:   4:267d32f158b3
    | |  parent:      1:3d65e7a57890
    | |  user:        Jim Hague <jim.hague@acm.org>
    | |  date:        Thu Apr 24 19:13:59 2008 +0100
    | |  summary:     Better first two lines
    | |
    o |  changeset:   3:a065eb26e6b9
    | |  user:        Jim Hague <jim.hague@acm.org>
    | |  date:        Thu Apr 24 18:52:31 2008 +0100
    | |  summary:     Rename my file
    | |
    o |  changeset:   2:ff97668b7422
    |/   user:        Jim Hague <jim.hague@acm.org>
    |    date:        Thu Apr 24 18:50:22 2008 +0100
    |    summary:     Finished first verse
    |
    o  changeset:   1:3d65e7a57890
    |  user:        Jim Hague <jim.hague@acm.org>
    |  date:        Wed Apr 23 22:49:10 2008 +0100
    |  summary:     A great second line
    |
    o  changeset:   0:33596ef855c1
       user:        Jim Hague <jim.hague@acm.org>
       date:        Wed Apr 23 22:36:33 2008 +0100
       summary:     My Pome

    $

So, our little branch change has now been merged back, and we have a
single line of development again. Notice that unlike the other
changesets, changeset 5 has two parent changesets, indicating it is a
merge changeset. You can only merge two branches in one operation; or
putting it another way, a changeset can have a maximum of two parents.

This behaviour is absolutely central to Mercurial's philosophy. If a
change is committed that takes as its starting point a change that
already has a child, then a branch gets created. Working with
Mercurial, branches get created frequently, and equally frequently
merged back. As befits any frequent operation, both are easy to do.

You're probably thinking at this point that this making a commit onto
an old version is a slightly strange thing to do, and you'd be right.
But that's exactly what's going to happen the moment you go
distributed. Two people working independently with their own
repositories are going to make commits based, typically, on the latest
changes they happen to have incorporated into their tree. To be
Distributed, a DVCS has to deal with this. Mercurial faces it head-on.
When you pull changes into your repo (or someone else pushes them), if
any of the changes overlap - are both based on the same base change -
you get extra heads, and it's up to you to let these extra heads live
or merge, as you please.

In practice this is more manageable than you might think. Consider a
typical Mercurial usage, where the 'master' repo sits on a known
server, and everyone pulls changes from the master and pushes their
own efforts to the master. By default Mercurial won't let you push if
the receiving repo will gain an extra head as a result, so you
typically pull (and do any required merging) just before
pushing. Subversion users will recognise this pattern. Subversion
won't let you commit a change if your working copy is not at the very
latest revision, so the Subversion user will update, and merge if
necessary, just before committing.

What, then, about a branch in the conventional sense of '1.0
maintenance branch'? Typically in Mercurial you'd handle this by
keeping a separate cloned repository for those changes. Cloning is
fast and, if local, uses hard links where possible on filesystems that
support them, so isn't necessarily extravagant on disc space. You can,
if you prefer, handle them all in a single repo with 'named
branches', but cloning is definitely simpler.

OK, so now you know the basics of using Mercurial. We can proceed to
looking at how this magic is achieved. In particular, where does this
magic globally unique identifier for a change come from?

Inside the Mercurial repo
-------------------------

The way Mercurial handles its repo is really quite simple.

That's simple, as in 'most things are simple once you know the
answer'. I found the explanation helpful, so this section attempts
the 10,000ft (FL100 if you prefer) view of Mercurial.

(Footnote: Bryan O'Sullivan's excellent Mercurial book has a chapter
on the subject, and the Mercurial website has a fair amount of detail
too. This is 'research', OK?)

First remember that any file or component can have at most two
parents. You can't merge more than one other branch at once.

We start with the basic building block, which Mercurial calls a
revlog. A revlog is a thing that holds a file and all the changes in
the file history. (Footnote: For any non-trivial file, this will
actually be two files on the disc, a data file and an index). The
revlog stores the (compressed) differences between successive versions
of the file, though it will periodically store a complete version of
the file instead of a difference, so that the content of any
particular file version can always be reconstructed without excessive
effort.

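To make the idea concrete, here is a toy in Python. It is emphatically
not Mercurial's implementation: the class and function names are
invented, the deltas are line-based and come from the standard difflib
module, nothing is compressed, and a full copy is stored every few
revisions on a fixed schedule, which is a simplifying assumption of
mine about what "periodically" means.

```python
import difflib


def make_delta(old, new):
    """Describe `new` as a list of (start, end, replacement) edits to `old`."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return [(i1, i2, new[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes() if tag != "equal"]


def apply_delta(old, delta):
    """Rebuild the newer list of lines from `old` and a delta."""
    result, pos = [], 0
    for start, end, replacement in delta:
        result.extend(old[pos:start])   # unchanged lines before the edit
        result.extend(replacement)      # lines supplied by the edit
        pos = end                       # skip the lines the edit replaced
    result.extend(old[pos:])
    return result


class ToyRevlog:
    def __init__(self, snapshot_every=3):
        self.snapshot_every = snapshot_every
        self.entries = []               # ("full", lines) or ("delta", edits)

    def add(self, text):
        lines = text.splitlines(keepends=True)
        rev = len(self.entries)
        if rev % self.snapshot_every == 0:
            self.entries.append(("full", lines))
        else:
            previous = self.read_lines(rev - 1)
            self.entries.append(("delta", make_delta(previous, lines)))
        return rev

    def read_lines(self, rev):
        # Walk back to the nearest full copy, then replay the deltas forward.
        base = rev
        while self.entries[base][0] != "full":
            base -= 1
        lines = list(self.entries[base][1])
        for _kind, delta in self.entries[base + 1:rev + 1]:
            lines = apply_delta(lines, delta)
        return lines

    def read(self, rev):
        return "".join(self.read_lines(rev))


log = ToyRevlog()
log.add("There was a gibbon one morning\n")
log.add("There was a gibbon one morning\n"
        'said "I think I will fly to the moon".\n')
print(log.read(1), end="")
```

The point of the periodic full copy shows up in read_lines: however
long the history, reconstruction never replays more deltas than lie
between the wanted revision and the nearest full copy before it.
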
Under the secret-squirrel Mercurial .hg directory at the top of your
project is a store which holds a revlog for each file in your project.

Any point in the evolution of a revlog can be uniquely identified with
a nodeid. This is simply the SHA1 hash of the current file contents
concatenated with the nodeids of one or both parents of the current
revision. Note that this way, two file states are identical if and
only if the file contents are the same *and* the file has the
same history.

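In Python the calculation comes out as just a few lines. Treat this as
a sketch rather than gospel: the function name is mine, and the detail
that the two parent nodeids are sorted and fed to the hash ahead of
the contents is my understanding of the scheme, not something spelled
out above. A missing parent is represented by twenty zero bytes.

```python
import hashlib

NULL_ID = b"\0" * 20    # the all-zero nodeid standing for "no parent"


def node_id(text, p1=NULL_ID, p2=NULL_ID):
    """Hash the two parent nodeids (assumed sorted) followed by the contents."""
    digest = hashlib.sha1(min(p1, p2))
    digest.update(max(p1, p2))
    digest.update(text)
    return digest.digest()


first = node_id(b"There was a gibbon one morning\n")
second = node_id(b"There was a gibbon one morning\n"
                 b'said "I think I will fly to the moon".\n', p1=first)
print(first.hex(), second.hex())
```

Because the parents go into the hash, the same text reached by a
different route gets a different nodeid, which is the "same contents
*and* same history" property.
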
Here's a dump of a revlog index:

    $ hg debugindex .hg/store/data/pome.txt.i
       rev    offset  length   base linkrev nodeid       p1           p2
         0         0      32      0       0 6bbbd5d6cc53 000000000000 000000000000
         1        32      51      0       1 83d266583303 6bbbd5d6cc53 000000000000
         2        83      84      0       2 14a54ec34bb6 83d266583303 000000000000
         3       167      76      3       4 dc4df776b38b 83d266583303 000000000000
    $

Note here that a file state can have two parents. If both the parent
nodeids are non-null, the file state has two parents, and the state is
therefore the result of a merge.

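The two-parents test is simple enough to apply mechanically. This
Python sketch reads the dump above as plain text; the class and
function names are invented for the purpose, and it relies only on the
column layout shown.

```python
from typing import NamedTuple

NULL = "000000000000"   # the abbreviated null nodeid, as printed in the dump


class IndexEntry(NamedTuple):
    rev: int
    offset: int
    length: int
    base: int
    linkrev: int
    nodeid: str
    p1: str
    p2: str

    @property
    def is_merge(self):
        # A merge is a revision whose parents are both non-null.
        return self.p1 != NULL and self.p2 != NULL


def parse_index(dump):
    entries = []
    for line in dump.strip().splitlines()[1:]:      # skip the header row
        fields = line.split()
        entries.append(IndexEntry(*map(int, fields[:5]), *fields[5:]))
    return entries


DUMP = """
rev offset length base linkrev nodeid p1 p2
0 0 32 0 0 6bbbd5d6cc53 000000000000 000000000000
1 32 51 0 1 83d266583303 6bbbd5d6cc53 000000000000
2 83 84 0 2 14a54ec34bb6 83d266583303 000000000000
3 167 76 3 4 dc4df776b38b 83d266583303 000000000000
"""

for entry in parse_index(DUMP):
    print(entry.rev, entry.nodeid, "merge" if entry.is_merge else "not a merge")
```

None of the four revisions of pome.txt is a merge, but revisions 2 and
3 name the same parent, 83d266583303. That is the branch from earlier,
seen from the file's point of view.
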
Let's dump out a revlog at a particular revision:

    $ hg debugdata .hg/store/data/pome.txt.i 2
    There was a gibbon one morning
    said "I think I will fly to the moon".
    So with two great palms strapped to his arms,
    he started his takeoff run.
    $

The next component is the manifest. This is simply a list of all the
files in the project, together with their current nodeids. The
manifest is a file, held in a revlog. The nodeid of the manifest,
therefore, identifies the project filesystem at a particular point.

    $ hg debugdata .hg/store/00manifest.i 5
    poem.txt5168b1a5e2f44aa4e0f164e170820845183f50c8
    $

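The file name and nodeid run together in that output only because of
how it is displayed. A Python sketch of reading a manifest follows,
under the assumption (mine, not stated above) that each line is the
file name, a NUL byte that the terminal does not show, and then the 40
hex digits of the nodeid; the function name is invented.

```python
def parse_manifest(data):
    """Map each file name to the hex nodeid of its current revision."""
    files = {}
    for line in data.splitlines():
        name, _, rest = line.partition(b"\0")
        files[name.decode()] = rest[:40].decode()   # keep the 40 hex digits
    return files


# The manifest line shown above, with the assumed NUL separator made explicit.
SAMPLE = b"poem.txt\x005168b1a5e2f44aa4e0f164e170820845183f50c8\n"

print(parse_manifest(SAMPLE))
# {'poem.txt': '5168b1a5e2f44aa4e0f164e170820845183f50c8'}
```
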
Finally we have the changeset. This is the atomic collection of
changes to a repository that leads to a new revision. The changeset
info includes the nodeid of the corresponding manifest, the timestamp
and committer ID, a list of changed files and a comment. The changeset
also includes the nodeid of the parent changeset, or the two parents
if the change is a merge. The changeset description is held in a
revlog, the changelog.

    $ hg debugdata .hg/store/00changelog.i 5
    1ccc11b6f7308cc8fa1573c2f3811a4710c91e3e
    Jim Hague <jim.hague@acm.org>
    1209061793 -3600
    poem.txt
    pome.txt

    Merge first line branch
    $

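That entry can be picked apart in a few lines of Python. The layout is
read straight off the dump above: manifest nodeid, committer,
timestamp, changed files, a blank line, then the comment. The function
name is mine, and the reading of the second timestamp field as seconds
west of UTC is inferred from the '+0100' in the log output for the
same changeset.

```python
from datetime import datetime, timedelta, timezone


def parse_changeset(text):
    header, _, description = text.partition("\n\n")
    manifest, user, when, *files = header.split("\n")
    seconds, offset = when.split()[:2]
    # The offset appears to be seconds west of UTC, so -3600 means UTC+1.
    zone = timezone(timedelta(seconds=-int(offset)))
    return {
        "manifest": manifest,
        "user": user,
        "date": datetime.fromtimestamp(int(seconds), tz=zone),
        "files": files,
        "description": description.strip(),
    }


ENTRY = """1ccc11b6f7308cc8fa1573c2f3811a4710c91e3e
Jim Hague <jim.hague@acm.org>
1209061793 -3600
poem.txt
pome.txt

Merge first line branch
"""

change = parse_changeset(ENTRY)
print(change["date"].isoformat())    # 2008-04-24T19:29:53+01:00
print(change["files"])               # ['poem.txt', 'pome.txt']
```

The date agrees with the 'Thu Apr 24 19:29:53 2008 +0100' that 'hg
glog' printed for changeset 5.
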
The nodeid of the changeset, therefore, gives us a globally unique
identifier for any particular change. Changesets have a
Subversion-like incrementing change number, but it is peculiar to that
repository. The nodeid, however, is global.

One more detail remains to complete the picture. How do we get back
from a particular file change to find the responsible changeset? Each
revlog change has a linkrev entry that does just this.

So, now we have a repository with a history of the changes applied to
that repository. Each change has a unique identifier. If we find that
change in another repository, it means that at the point in the other
repository we have exactly the same state; the file contents and
history are identical.

At this point we can see how pulling changes from another repository
works. Mercurial has to determine which changesets in the source
repository are missing in the target repository. To do this, for each
head in the source repo it has to find the most recent change in that
head that is already present in the target repo, and get any remaining
changes after that point. These changes are then copied over and
applied.

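As a sketch of that search in Python: walk back from each source head,
stopping wherever the target already has the change. The assumption
doing the heavy lifting here is that the whole parent graph of the
source is to hand as a dictionary; a real pull has to negotiate this
over the wire, which the sketch does not attempt. The function name is
invented, and the graph is the poem repository from earlier.

```python
def missing_changesets(source_parents, source_heads, target_nodes):
    """List the changesets reachable from the source heads that the target lacks."""
    missing, stack = set(), list(source_heads)
    while stack:
        node = stack.pop()
        if node in target_nodes or node in missing:
            continue        # the target has this change, and so its ancestors
        missing.add(node)
        stack.extend(source_parents[node])
    return missing


# Source: the repository after 'Better first two lines' was committed.
SOURCE = {
    "33596ef855c1": [],
    "3d65e7a57890": ["33596ef855c1"],
    "ff97668b7422": ["3d65e7a57890"],
    "a065eb26e6b9": ["ff97668b7422"],
    "267d32f158b3": ["3d65e7a57890"],
}
# Target: a clone taken before that commit.
TARGET = {"33596ef855c1", "3d65e7a57890", "ff97668b7422", "a065eb26e6b9"}

print(missing_changesets(SOURCE, ["267d32f158b3", "a065eb26e6b9"], TARGET))
# {'267d32f158b3'}
```

The walk can stop at the first known change on each path because a
nodeid vouches for the whole history behind it.
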
The Mercurial revlog format has proved remarkably durable. Since the
first release of Mercurial in April 2005, there have been a total of 5
changes to the file format. However, of those, all but one have been
changes to the handling of file names. The most recent change, in
October 2008, and its predecessor in December 2006, were both
introduced purely to cope with Windows-specific issues. The one change
that touched the data structures described above was in April 2006. The
format introduced, RevLogNG, changed only the details of index data
held, not the overall design. The chief Mercurial developer, Matt
Mackall, notes that the code in present-day Mercurial devoted to
reading the old format comprises 28 lines of Python. Compared with,
say, the early tribulations of Subversion and the switch from bdb to
fsfs, this is an impressive record.