But how is that supposed to work? What happens when you make some changes to a file and save it? Do you want the "git file system" to commit it right away, or wait until you issue a "commit" command? The first behavior would obviously be wrong, and the second would make the "file system" not operationally transparent anyway. Right? By the way, the only SCM I have worked with that tries to mount its repository (or a view on top of it) as a file system is ClearCase with its dynamic views. And, between the buggy file system implementation, the intrusion on workflow, and the lack of scalability, at least in the organization I worked for, it turned out to be a horrible, horrible, horrible idea. Cheers. -- Jing Xue -
Not sure what you mean by operationally transparent? It would be transparent for the updating client, and the rest of the git-users would need to wait for the commit from the updating client; which is ok, as this transparency is not meant to change the server-side git-update semantic. Judging an idea, based on a flawed implementation, doesn't prove that the idea itself is flawed. You could probably do that, or you could instead use cp -al. Both would Sure, you wouldn't want to change the git-engine update semantics, as that sits on the server and handles all users. But what the git model is currently missing is a client manager. Right now, this is being worked around by replicating the git tree on the client, which still doesn't provide the required transparency. IOW, git currently only implements the server-side use-case, but fails to deliver on the client-side. By introducing a git-client manager that handles the transparency needs of a single user, it should be possible to clearly isolate update semantics for both the client and the server, each handling their specific use-case. Thanks! -- Al --
It isn't the implementation that is flawed, it is the idea. The entire point of a change control system is that you explicitly define change sets and add comments to the set. The filesystem was designed to allow changes to be made willy-nilly. If your goal is to perform change control only with filesystem semantics, then you have a non-starter, as their goals are opposing. Requiring an explicit command is hardly burdensome, and otherwise, a git tree is perfectly transparent to It isn't missing a client manager, it was explicitly designed to not have one, at least not as a distinct entity from a server, because it does not use a client/server architecture. This is very much by design, not a workaround. What transparency are you requiring here? You can transparently read your git tree with all non-git-aware tools, what other meaning of Any talk of client or server makes no sense since git does not use a client/server model. If you wish to use a centralized repository, then git can be set up to transparently push/pull to/from said repository if you wish via hooks or cron jobs. --
Whether git uses the client/server model or not does not matter; what matters is that there are two distinct use-cases at work here: one on the Again, this only handles the interface to/from the server/repository, but once you pulled the sources, it leaves you without Version Control on the client. By pulling the sources into a git-client manager mounted on some dir, it should be possible to let the developer work naturally/transparently in a readable/writeable manner, and only require his input when reverting locally or committing to the server/repository. Thanks! -- Al --
Git is distributed. The repository is everywhere. No server is actually needed. No, that's CVS, SVN and other centralized scm's. With git you have perfect version control on each peer. That's the entire idea behind "fully How is that different from what every SCM, including git, is doing today? The user needs to tell the scm when it's time to take a snapshot of the current state. Git is distributed though, so committing is usually not the same as publishing. Is that lack of a single command to commit and publish what's nagging you? If it's not, I completely fail to see what you're getting at, unless you've only ever looked at repositories without a worktree attached, or you think that git should work like an editor's "undo" functionality, which would be quite insane. -- Andreas Ericsson andreas.ericsson@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 --
When you read server, don't read it as localized; a server can be distributed. What distinguishes a server from an engine is that it has to handle a multi-user use-case. How that is implemented, locally or remotely As explained before in this thread, replicating the git tree on the client You need to re-read the thread. Thanks! -- Al --
Hi, I don't know why you write that, and then say thanks. Clearly, what you wrote originally, and what Andreas pointed out, were quite obvious indicators that git already does what you suggest. You _do_ work "transparently" (whatever you understand by that overused term) in the working directory, unimpeded by git. And whenever it is time to revert or commit, you cry for help, invoking git. So either you succeeded in making yourself misunderstood, or Andreas had quite the obvious and correct comment for you. Not that difficult, Dscho --
If you go back in the thread, you may find a link to a gitfs client that somebody kindly posted. This client pretty much defines the transparency I'm talking about. The only problem is that it's read-only. To make it really useful, it has to support versioning locally, disconnected from the server repository. One way to implement this, could be by committing every update unconditionally to an on-the-fly created git repository private to the gitfs client. With this transparently created private scratch repository it should then be possible for the same gitfs to re-expose the locally created commits, all without any direct user-intervention. Later, this same scratch repository could then be managed by the normal git-management tools/commands to ultimately update the backend git repositories. BTW: Sorry for my previous posts that contained the wrong date; it seems that hibernation sometimes advances the date by a full 24h. Has anybody noticed this as well? Thanks! -- Al --
Earlier you said that you need to be able to tell git when you want to make a commit, which means pretty much any old filesystem could serve as gitfs. Now you're saying you want every single update to be committed, which would make it mimic an editor's undo functionality. I still don't get what it is That's exactly what's happening today. I imagine whoever wrote the gitfs thing did so to facilitate testing, or as some form of intellectual masturbation. So, to get to the bottom of this, which of the following workflows is it you want git to support?

### WORKFLOW A ###
edit, edit, edit
edit, edit, edit
edit, edit, edit
Oops I made a mistake and need to hop back to "current - 12".
edit, edit, edit
edit, edit, edit
publish everything, similar to just tarring up your workdir and sending out
### END WORKFLOW A ###

### WORKFLOW B ###
edit, edit, edit
ok this looks good, I want to save a checkpoint here
edit, edit, edit
looks good again. next checkpoint
edit, edit, edit
oh crap, back to checkpoint 2
edit, edit, edit
ooh, that's better. save a checkpoint and publish those checkpoints
### END WORKFLOW B ###

If you could just answer that question and stop writing "transparent" or any synonym thereof six times in each email, we can possibly help you. As it stands now though, nobody is very interested because you haven't explained how you want this "transparency" of yours to work in an every day scenario. -- Andreas Ericsson andreas.ericsson@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 --
### WORKFLOW C ###
for every save on a gitfs mounted dir, do an implied checkpoint, commit, or publish (should be adjustable), on its privately created on-the-fly repository.
### END WORKFLOW C ###

For example:

  echo "// last comment on this file" >> /gitfs.mounted/file

should do an implied checkpoint, and make these checkpoints immediately visible under some checkpoint branch of the gitfs mounted dir. Note, this way the developer gets version control without even noticing, and it works completely transparently with any kind of application. Thanks! -- Al --
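To make the proposal concrete, here is a minimal Python sketch of what one "implied checkpoint" could amount to if it were built on ordinary git commands. It is an illustration only, not how any existing gitfs works: the function names, the "checkpoint:" message format, and the scratch repository path are all hypothetical, and the checkpoint-branch part of the proposal is left out.

```python
import subprocess


def checkpoint_commands(repo_dir, path):
    """Return the git invocations for one implied checkpoint of `path`.

    `repo_dir` is assumed to be an already initialised scratch repository;
    the commit message format is made up for this sketch.
    """
    message = f"checkpoint: {path}"
    return [
        ["git", "-C", repo_dir, "add", "--", path],
        ["git", "-C", repo_dir, "commit", "--quiet", "-m", message, "--", path],
    ]


def checkpoint(repo_dir, path, run=subprocess.run):
    """Stage and commit a single saved file without asking the user anything.

    `run` is injectable so the sketch can be exercised without git installed.
    With the default runner, a save that changed nothing makes `git commit`
    exit non-zero, which surfaces here as CalledProcessError.
    """
    for argv in checkpoint_commands(repo_dir, path):
        run(argv, check=True)
```

Something would still have to call `checkpoint()` on every save, and every call produces a one-file commit with a machine-generated message.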
It looks like it is WORKFLOW A (with the fact that each ',' is file Why not use a versioning filesystem for that, for example ext3cow (which looks surprisingly git-like, when you take into account that for ext3cow history is linear and centralized, so one can use a date or sequential number to name commits). See the GitLinks page on the Git Wiki, "Other links" section: http://www.ext3cow.com/ A version control system is all about WORKFLOW B, where the programmer controls when it is time to commit (and in a private repository he/she can then rewrite history to arrive at a "Perfect patch series"[*1*]); something that for example CVS failed at, requiring the programmer to do a merge if upstream has any changes when trying to commit. [*1*] I have lost the link to the post at LKML about rewriting history to arrive at a perfect patch _series_. IIRC I found it the first time on this mailing list. I would be grateful if you sent me this link if you have it. TIA. -- Jakub Narebski ShadeHawk on #git --
Sure, Linus mentioned the cow idea before in this thread, but you would still Because WORKFLOW C is transparent, it won't affect other workflows. So you could still use your normal WORKFLOW B in addition to WORKFLOW C, gaining an additional level of version control detail at no extra cost other than the git-engine scratch repository overhead. BTW, is git efficient enough to handle WORKFLOW C? Thanks! -- Al --
Imagine the number of commits a 'make clean; make' will do in a kernel tree, as it commits all those .o files... :)
My guess is that Al is not really a developer (product management/ marketing?), what he has in mind is probably not an SCM but a backup system a la Mac's time machine or Netapp's snapshots that also support disconnected commits. I think that git could be a suitable engine for such systems, after a few tweaks to avoid compressing already compressed blobs like jpeg, mp3 and mpeg etc. __Luke --
.o files??? It probably goes without saying that gitfs should have some basic configuration file to set up its transparent behaviour, which would most probably contain an include/exclude file-filter mask along with other basic configuration options. But this is really secondary to the implementation, and the question remains whether git is efficient enough. IOW, how big is the git commit overhead as compared to a normal copy? Thanks! -- Al --
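As a rough illustration of the include/exclude file-filter mask mentioned above, the following Python sketch decides whether a path would be versioned at all. The pattern lists and the function name are hypothetical, since no such gitfs configuration format was specified in the thread.

```python
from fnmatch import fnmatchcase

# Hypothetical masks; a real configuration file would supply these.
EXCLUDE = ["*.o", "*.swp", ".git/*"]
INCLUDE = ["*"]


def should_version(path, include=INCLUDE, exclude=EXCLUDE):
    """Return True if a save to `path` would trigger an implied commit.

    Exclusions win over inclusions. Note that fnmatch-style '*' also
    matches '/' characters, so "*.o" covers object files at any depth.
    """
    if any(fnmatchcase(path, pattern) for pattern in exclude):
        return False
    return any(fnmatchcase(path, pattern) for pattern in include)
```

With these masks, `should_version("kernel/sched.c")` is true while `should_version("kernel/sched.o")` is false.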
But then it's not *truly* transparent, is it? And that leaves another question - if you make a config file that excludes all the .o files - then what's backing the .o files? Those data blocks need to be *someplace*. Maybe you can do something ugly like use unionfs to combine your gitfs with something else to store the other files... But at that point, you're probably better off just creating a properly designed versioning filesystem.
Don't mistake transparency with some form of auto-heuristic. Transparency only means that it inserts functionality without impeding your normal But gitfs is not about designing a versioning filesystem, it's about designing a transparent interface into git to handle an SCM use-case. Thanks! -- Al --
Hi, The question is not if git is efficient enough to handle workflow C, but if that workflow is efficient enough to help anybody. Guess what takes me the longest time when committing? The commit message. But it is really helpful, so there is a _point_ in writing one, and there is a _point_ in committing when I do it: it is a point in time where I expect the tree to be in a good shape, to be compilable, and to solve a specific problem which I describe in the commit message. So I absolutely hate this "transparency". Git _is_ transparent; it does not affect any of my other tools; they still work very well thankyouverymuch. What your version of "transparency" would do: destroy bisectability, make an absolute gibberish of the history, and more! Nobody could read the output of "git log" and form an understanding of what was done. Nobody could read the commit message for a certain "git blame"d line that she tries to make sense of. IOW you would revert the whole meaning of the term Source Code Management. Hth, Dscho --
So you *do* want an editor's undo function, but for an entire filesystem. That's a handy thing to have every now and then, but it's not what git One other thing that's fairly important to note is that this can never ever handle changesets, since each write() of each file will be a commit on its own. It's so far from what git does that I think you'd be better off just implementing it from scratch, or looking at a versioned fs, like Jakub suggested in his reply. You're also neglecting one very important aspect of what an SCM provides if you go down this road, namely project history. You basically have two choices with this "implicit save on each edit":

* force the user to supply a commit message for each and every edit
* ignore commit messages altogether

Obviously, forcing a commit message each time is the only way to get some sort of proper history to look at after it's done, but it's also such an appalling nuisance that I doubt *anyone* will actually like that, and since changesets aren't supported, you'll have "implement xniz api, commit 1 of X" messages. Cumbersome, stupid, and not very useful. Ignoring commit messages altogether means you ignore the entire history, and the SCM then becomes a filesystem-wide "undo" cache. This could of course work, but it's something akin to building a nuclear powerplant to power a single lightbulb. -- Andreas Ericsson andreas.ericsson@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 --
so if you have a script that does

  echo "mail header" >tmpfile
  echo "subject:" >>tmpfile
  echo >>tmpfile
  echo "body" >>tmpfile

you want to have four separate commits?

what if you have a perl script

  open outfile, ">tmpfile";
  print outfile "mail header\n";
  print outfile "subject:\n\n";
  print outfile "body\n";
  close outfile;

how many separate commits do you think should take place? what if $|=1 (unbuffered output, so that each print statement becomes visible to other programs immediately)? what if the file is changed via mmap? should each byte/word written to memory be a commit? or when the mmap is closed? or when the kernel happens to flush the page to disk? 'recording every change to a filesystem' is a very incomplete definition of a goal. David Lang --
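The buffering question can be demonstrated with a small Python sketch. It swaps the perl script for Python's `io` layer and uses an in-memory stand-in for the file, so `CountingRaw` and `emit` are illustrative names rather than anything from the thread. The count of low-level writes stands in for what a write-triggered filesystem hook would see.

```python
import io


class CountingRaw(io.RawIOBase):
    """In-memory stand-in for a file that counts low-level write() calls."""

    def __init__(self):
        super().__init__()
        self.writes = 0
        self.data = bytearray()

    def writable(self):
        return True

    def write(self, b):
        # Each call here is one write the underlying "filesystem" observes.
        self.writes += 1
        self.data += bytes(b)
        return len(b)


def emit(lines, flush_each):
    """Write `lines` through a buffer; return how many raw writes occurred."""
    raw = CountingRaw()
    out = io.BufferedWriter(raw)
    for line in lines:
        out.write(line)
        if flush_each:
            out.flush()  # mimics unbuffered output
    out.flush()
    return raw.writes


LINES = [b"mail header\n", b"subject:\n\n", b"body\n"]
buffered = emit(LINES, flush_each=False)
unbuffered = emit(LINES, flush_each=True)
print(buffered, unbuffered)
```

The same three logical prints reach the raw layer once when buffered and three times when flushed after each print, so "one commit per write" gives a different history for identical file contents.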
Ouch... That looks worse than "plain" per-file versioning. Not only do you by definition get "broken" commits if there's a change that affects two dependent files, you also get an insane amount of commits just for testing stuff, or fixing bugs. And unless you use some kind of union-fs on top (or keep ignored files in a special unversioned area in your gitfs, which seems somewhat ugly), you'll probably also have to track lots of files in the working directory that are generated, unless you want to re-generate them after each reboot. And that leads to even more absolutely useless revisions. Just thinking of my vim .swp files (which I definitely don't want to lose on a crash/power outage/pkill -9 .<ENTER> dammit) makes me scream because of the gazillion commits they will produce (and no, I don't want them in some special out-of-tree directory). Plus, I have vim set up to _replace_ files on write, so that I can more easily use hard-linked copies without changing all copies at once _unless_ I explicitly want to, meaning that I'd get full remove/add commits, which are absolutely useless. And trying to detect such patterns (rename, then write the changed file with the old name and then delete the renamed file) is probably not worth the trouble, because you coincidentally might _want_ to have just these three steps recorded when you happen to perform them manually. And if you go for heuristics, you'll complain each time you get a false positive/negative. That said, out of pure curiosity I came up with the attached script which just uses inotifywait to watch a directory and issue git commands on certain events. It is extremely stupid, but seems to work. And at least it hasn't got the drawbacks of a real gitfs regarding the need to have a "separate" non-versioned storage area for the working directory, because it simply uses the existing working directory wherever that might be stored. It doesn't use GIT_DIR/WORK_DIR yet, but hey, should be easy to add...
Feel free to mess ...
It has been pointed out to you that it DOES. Either that or nobody else understands your nebulous use of "transparency", so maybe you should define it like we've been asking you. Furthermore, the comment you replied to said nothing about transparency, nor did your comment it was in reply to; rather it was pointing out the fact that your statement that git cannot perform version control on the client is patently Perhaps you should. We have been trying to get you to explain how you think git isn't "transparent" while at the same time pointing out how we think it is. You have failed to demonstrate any evidence to back up your claims, all of which have been shown to be false. --
I guess what he means is that when you write to the file -- from your editor -- it can't be considered a commit. During an editing session you might write a dozen times, only to commit it once you are happy If you want a dumb-ish client CVS-style, you can try git-cvsserver. But the git model is definitely superior -- "replicating the tree on the client" is not a workaround but a central strategy. Have you used git and other DSCMs much? From your writing, it sounds like you may have misunderstood how some of the principles of git work out in practice. cheers, m --
