Discussion:
[Gnucash-changes] Spruce up the delete window dialog to make it more HIG compliant.
Neil Williams
2005-07-20 13:42:10 UTC
+ /*
+ * *** THIS DIALOG IS NOT HIG COMPLIANT. ***
+ *
+ * According to the HIG, the secondary context should include
+ * context about the number of changes that will be lost (either in
+ * time or a count). While it is possible to simply provide the
+ * time since the last save, that doesn't appear too useful. If
+ * the user has had Gnucash open for hours in the background, but
+ * only made a change in the last few minutes, then telling them
+ * they will lose hours' worth of work is wrong. The QOF code needs
+ * to be modified to provide better timing information. The best
+ * case scenario would be if QOF could provide a timestamp of the
+ * oldest unsaved change.
+ */
The SQL backend uses qof_instance_get_last_update but this dialog would
require iterating over every instance in the book, one type at a time prior
to displaying the dialog, then sorting the timespecs (and this when the user
is waiting for the app to close!)

The alternative method would involve keeping tabs on all instances and sorting
the various update times but then that structure would need to be stored
outside the library anyway (couldn't be a static).

Do other applications interpret "Time Period" in the above manner?

The HIG only specifies:
"The secondary text provides the user with some context about the number of
changes that might be unsaved."

IMHO, "some context" does not mean "number of seconds since the last change,
anywhere in a 2Mb book". The iterations themselves could delay the display of
the dialog. I don't see how this is a job for the QOF library.

How many people really do leave GnuCash running in the background?

http://developer.gnome.org/projects/gup/hig/1.0/windows.html#alerts-confirmation

The example itself only shows a simple time period, no hint that this has to
be the time period since the last modification or the last save.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
David Hampton
2005-07-20 14:52:41 UTC
Post by Neil Williams
+ /*
+ * *** THIS DIALOG IS NOT HIG COMPLIANT. ***
+ *
+ * According to the HIG, the secondary context should include
+ * context about the number of changes that will be lost (either in
+ * time or a count). While it is possible to simply provide the
+ * time since the last save, that doesn't appear too useful. If
+ * the user has had Gnucash open for hours in the background, but
+ * only made a change in the last few minutes, then telling them
+ * they will lose hours' worth of work is wrong. The QOF code needs
+ * to be modified to provide better timing information. The best
+ * case scenario would be if QOF could provide a timestamp of the
+ * oldest unsaved change.
+ */
The SQL backend uses qof_instance_get_last_update but this dialog would
require iterating over every instance in the book, one type at a time prior
to displaying the dialog, then sorting the timespecs (and this when the user
is waiting for the app to close!)
O.K. I'm confused. I thought the point of switching to SQL was that the
data was always written through to the database, so that if gnucash
crashed at any point there would be no lost work.
Post by Neil Williams
The alternative method would involve keeping tabs on all instances and sorting
the various update times but then that structure would need to be stored
outside the library anyway (couldn't be a static).
How hard is it to keep track of the first time this session you issued
an UPDATE or DELETE command? I don't need to know what it was, only
when it was. Isn't the SQL code contained in a small set of files?
Post by Neil Williams
Do other applications interpret "Time Period" in the above manner?
I don't know. I would love to see "... your 5 changes from the past 1
hour and 30 minutes will be discarded" but I figured that was asking for
too much.
Post by Neil Williams
"The secondary text provides the user with some context about the number of
changes that might be unsaved."
With a nice big picture showing time in minutes.
Post by Neil Williams
IMHO, "some context" does not mean "number of seconds since the last change,
anywhere in a 2Mb book". The iterations themselves could delay the display of
the dialog. I don't see how this is a job for the QOF library.
I don't hear an alternative context proposal here. What sort of context
do you propose should be provided to users in place of a time?

QOF already keeps dirty flags for every item it knows about. How
hard would it be to change the code that marks objects as modified from

dirty = TRUE

to:

if (dirty == 0)
    dirty = time();

When you try and close the main gnucash window, QOF already iterates
over the entire set of objects. It must, in order to determine if a
save dialog is needed. The best case today is that the first object is
modified and the iteration can bail. The worst case today is that every
object has to be inspected to discover that it has not been modified.
This is true today of every extant version of GnuCash. The dialog is
different, but the code path to see if the dialog is needed is the same.
In order to provide a time in the context dialog, a boolean test would
be replaced with a test for a non-zero time (still a single instruction)
and then a possible time comparison. Yes, the entire object tree would
have to be run each time, but is that really a big deal? Have you
noticed a long delay when quitting gnucash when you haven't changed
anything? The additional time comparison code would only be executed for
each object that is modified.
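A minimal sketch of the change David describes, assuming a plain C struct (`Instance`, `instance_mark_dirty`, and `instance_is_dirty` are illustrative names, not actual QOF API): the boolean dirty flag becomes the time of the first unsaved modification, so testing it stays a single comparison.

```c
#include <time.h>
#include <stdbool.h>

/* Illustrative only: 0 means clean, any other value is the time of
 * the first unsaved change in this session. */
typedef struct {
    time_t dirty_since;
} Instance;

static void instance_mark_dirty(Instance *inst)
{
    if (inst->dirty_since == 0)      /* only the first edit records a time */
        inst->dirty_since = time(NULL);
}

static bool instance_is_dirty(const Instance *inst)
{
    return inst->dirty_since != 0;   /* as cheap as the old boolean test */
}
```

Later edits leave the timestamp untouched, so the oldest unsaved change is simply the smallest non-zero `dirty_since` across instances.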
Post by Neil Williams
How many people really do leave GnuCash running in the background?
I have on occasion.
Post by Neil Williams
http://developer.gnome.org/projects/gup/hig/1.0/windows.html#alerts-confirmation
The example itself only shows a simple time period, no hint that this has to
be the time period since the last modification or the last save.
Agreed, but who cares about the time period since the last save? I
couldn't care less that I haven't saved in 25 days if there aren't any
changes to the file. If there are changes to the file, I would like to
know that the change was 42 minutes ago, not wonder what I changed in
the last 25 days.

David
Derek Atkins
2005-07-20 17:10:07 UTC
Post by David Hampton
O.K. I'm confused. I thought the point of switching to SQL was that the
data was always written through to the database, so that if gnucash
crashed at any point there would be no lost work.
You're not confused. That is the point of moving to SQL. Neil
was just pointing out that the SQL code can do what you're asking
fairly easily.

[snip]
Post by David Hampton
QOF already keeps dirty flags for every item it knows about. How
hard would it be to change the code that marks objects as modified from
dirty = TRUE
if (dirty == 0)
dirty = time();
When you try and close the main gnucash window, QOF already iterates
over the entire set of objects. It must, in order to determine if a
save dialog is needed. The best case today is that the first object is
modified and the iteration can bail. The worst case today is that every
object has to be inspected to discover that it has not been modified.
This is true today of every extant version of GnuCash. The dialog is
different, but the code path to see if the dialog is needed is the same.
In order to provide a time in the context dialog, a boolean test would
be replaced with a test for a non-zero time (still a single instruction)
and then a possible time comparison. Yes, the entire object tree would
have to be run each time, but is that really a big deal? Have you
noticed a long delay when quitting gnucash when you haven't changed
anything? The additional time comparison code would only be executed for
each object that is modified.
I think something like this would work fine. We really should have
a single interface that we can call to mark an object/instance as
dirty. That way we can do it in one way and inherit changes, so we
don't need to reimplement it in each object.

This might best go hand-in-hand with an update to the event system.
In particular I want to add more events so we can differentiate
between an account being modified and a new transaction entering an
account (for example).

-derek
--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
***@MIT.EDU PGP key available
Neil Williams
2005-07-20 18:28:00 UTC
Post by Derek Atkins
We really should have
a single interface that we can call to mark an object/instance as
dirty.
OK, all I need now is the time to replace every obj->inst.dirty = TRUE with
qof_instance_set_dirty((QofInstance*)obj)
:-(

void qof_instance_set_dirty(QofInstance* inst)
{
    QofBook *book;
    QofCollection *coll;
    QofEntity *ent;

    inst->dirty = TRUE;
    ent = inst->entity;
    book = qof_instance_get_book(inst);
    coll = qof_book_get_collection(book, ent->e_type);
    qof_collection_mark_dirty(coll);
}

Then replace every is_dirty = NULL with
is_dirty = qof_collection_is_dirty

Yes?

I'm not sure about the time handling as yet.

My TODO list is getting ever longer . . . .
Post by Derek Atkins
That way we can do it in one way and inherit changes, so we
don't need to reimplement it in each object.
I think we'd need the is_dirty prototype, no?
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Neil Williams
2005-07-20 18:53:36 UTC
Post by Neil Williams
OK, all I need now is the time to replace every obj->inst.dirty = TRUE with
qof_instance_set_dirty((QofInstance*)obj)
:-(
void qof_instance_set_dirty(QofInstance* inst)
{
QofBook *book;
QofCollection *coll;
QofEntity *ent;
inst->dirty = TRUE;
ent = inst->entity;
ent = &inst->entity; /* :-) */
Post by Neil Williams
book = qof_instance_get_book(inst);
coll = qof_book_get_collection(book, ent->e_type);
qof_collection_mark_dirty(coll);
}
That function will be in my next commit (which isn't due particularly soon).
Post by Neil Williams
Then replace every is_dirty = NULL with
is_dirty = qof_collection_is_dirty
and make_clean = NULL with make_clean = qof_collection_make_clean

When a collection is marked clean, does THAT have to iterate down to every
instance? So far, I've avoided anything that requires iteration over every
single instance for this task. It's only the collection that is actually
monitored for being dirty. It seems pointless to check every instance when
only one instance may have caused the collection to be marked as dirty. It's
very quick to code this in but it could add a noticeable delay on Save.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Neil Williams
2005-07-20 19:03:21 UTC
Post by Neil Williams
When a collection is marked clean, does THAT have to iterate down to every
instance?
Ignore that, I'll make it so that if the collection is clean,
qof_instance_is_dirty returns (and sets) FALSE.

...
if(qof_collection_is_dirty(coll)) { return inst->dirty; }
inst->dirty = FALSE;
return FALSE;

i.e. don't shortcircuit things by setting inst->dirty direct!

(As part of the CLI, I'll be reviewing the private headers exported by QOF
soon and qofinstance-p.h will be removed from EXTRA_DIST which should prevent
such shortcuts in the future. qofid-p.h will also be removed.)
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
David Hampton
2005-07-20 19:52:47 UTC
Post by Neil Williams
(As part of the CLI, I'll be reviewing the private headers exported by QOF
soon and qofinstance-p.h will be removed from EXTRA_DIST which should prevent
such shortcuts in the future. qofid-p.h will also be removed.)
Ahh, I thought EXTRA_DIST controlled what went into the tarball.
Removing any header files from this will cause 'make distcheck' to fail,
not that that would be anything new.

David
Derek Atkins
2005-07-20 20:02:07 UTC
Post by David Hampton
Post by Neil Williams
(As part of the CLI, I'll be reviewing the private headers exported by QOF
soon and qofinstance-p.h will be removed from EXTRA_DIST which should
prevent such shortcuts in the future. qofid-p.h will also be removed.)
Ahh, I thought EXTRA_DIST controlled what went into the tarball.
Removing any header files from this will cause 'make distcheck' to fail,
not that that would be anything new.
Correct, all source files (including headers) must be in a _SOURCES, _HEADERS,
or EXTRA_DIST. You cannot remove them from all of those. A header need not be
installed, but it must be in the dist or building from the tarball will fail.

Neil: I would ask that you check with us before making any build-system changes
like this :)

You're welcome to remove the -p.h files from the install rules, but removing
from EXTRA_DIST is wrong.
Post by David Hampton
David
-derek
--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
***@MIT.EDU PGP key available
Christian Stimming
2005-07-20 20:22:05 UTC
Post by Derek Atkins
Post by David Hampton
Ahh, I thought EXTRA_DIST controlled what went into the tarball.
Removing any header files from this will cause 'make distcheck' to fail,
not that that would be anything new.
Correct, all source files (including headers) must be in a _SOURCES,
_HEADERS, or EXTRA_DIST. You cannot remove them from all of those. A
header need not be installed, but it must be in the dist or building from
the tarball will fail.
Actually I would suggest adding those not-to-be-installed headers to the
variable noinst_HEADERS. That should have the same effect, while at the
same time making it clear that these files are headers, which are always
in the dist... whatever...
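For illustration, a hypothetical Makefile.am fragment along those lines (library and file names invented; only the variable names are real automake conventions):

```makefile
# Hypothetical fragment: private headers stay in the tarball (so
# 'make distcheck' still passes) but are never installed.
lib_LTLIBRARIES   = libqof.la
libqof_la_SOURCES = qofinstance.c qofid.c

# Public headers: installed under $(includedir)
include_HEADERS   = qofinstance.h qofid.h

# Private headers: distributed, not installed
noinst_HEADERS    = qofinstance-p.h qofid-p.h
```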

Christian
Neil Williams
2005-07-20 20:26:48 UTC
Post by Derek Atkins
Post by David Hampton
Ahh, I thought EXTRA_DIST controlled what went into the tarball.
Oops.
Post by Derek Atkins
Post by David Hampton
Removing any header files from this will cause 'make distcheck' to fail,
not that that would be anything new.
Correct, all source files (including headers) must be in a _SOURCES,
_HEADERS, or EXTRA_DIST. You cannot remove them from all of those.
No, I wouldn't do that. I just would make sure that these private headers
aren't installed and the functions in them are not exported.

noinst_HEADERS is what I meant. Sorry.

(Should have checked before I sent it - what I meant was not what I typed!)
Post by Derek Atkins
A
header need not be installed, but it must be in the dist or building from
the tarball will fail.
Yes, sorry. Installation is what I meant.
Post by Derek Atkins
Neil: I would ask that you check with us before making any build-system
changes like this :)
:-)

(Do you remember when the devel list was so quiet someone sent a ping message?
That was when I was on holiday / limited email access. Funny how those two
events coincided!)
:-)))
Post by Derek Atkins
You're welcome to remove the -p.h files from the install rules, but
removing from EXTRA_DIST is wrong.
Agreed.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Neil Williams
2005-07-20 20:16:27 UTC
Post by David Hampton
Post by Neil Williams
(As part of the CLI, I'll be reviewing the private headers exported by
QOF soon and qofinstance-p.h will be removed from EXTRA_DIST which should
prevent such shortcuts in the future. qofid-p.h will also be removed.)
Ahh, I thought EXTRA_DIST controlled what went into the tarball.
It does. Whilst CashUtil remains a separate project, it will provide a
valuable testbed for what should and should not go into EXTRA_DIST with
regard to QOF. I'm concentrating on making sure that QOF builds correctly and
those changes will be put into GnuCash such that GnuCash will use QOF
correctly too.

I'll be changing calls across GnuCash such that GnuCash doesn't look for
files/functions/variables that won't be exported in libqof1.

This will also make it easier to move CashUtil into the GnuCash tree if that
becomes viable.
Post by David Hampton
Removing any header files from this will cause 'make distcheck' to fail,
Not once I'm through because I'll be changing any calls that would break the
distcheck.

PilotQOF and CashUtil are small projects and I'll be developer and maintainer
for each so I will be very strict with EXTRA_DIST in QOF.

Inside GnuCash, I'll be looking at ways to make the CLI work AND at ways to
ease the completion of the QOF spinout so that when the time comes, there are
no major dramas in the Makefiles.

Once G2 is out, I envisage building GnuCash against libqof1 as the default on
my system and that should iron out lots of those problems. I don't want to be
manually synchronising QOF source files much beyond G2! It's a PITA!
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Derek Atkins
2005-07-20 20:22:36 UTC
Post by Neil Williams
Post by David Hampton
Post by Neil Williams
(As part of the CLI, I'll be reviewing the private headers exported by
QOF soon and qofinstance-p.h will be removed from EXTRA_DIST which should
prevent such shortcuts in the future. qofid-p.h will also be removed.)
Ahh, I thought EXTRA_DIST controlled what went into the tarball.
It does. Whilst CashUtil remains a separate project, it will provide a
valuable testbed for what should and should not go into EXTRA_DIST with
regard to QOF. I'm concentrating on making sure that QOF builds correctly and
those changes will be put into GnuCash such that GnuCash will use QOF
correctly too.
Huh? Why is CashUtil depending upon the QOF build-tree? Shouldn't it only
depend on what gets installed during "make install"?

EXTRA_DIST has nothing to do with what happens during "make install".

-derek
--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
***@MIT.EDU PGP key available
Neil Williams
2005-07-20 20:33:58 UTC
Post by Derek Atkins
Huh? Why is CashUtil depending upon the QOF build-tree? Shouldn't it only
depend on what gets installed during "make install"?
See other message regarding noinst_HEADERS. Yes, CashUtil, PilotQOF and all
the others will only depend on the results of make install. That is indeed
how they are currently built and tested.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Derek Atkins
2005-07-20 20:37:38 UTC
Post by Neil Williams
Post by Derek Atkins
Huh? Why is CashUtil depending upon the QOF build-tree? Shouldn't it only
depend on what gets installed during "make install"?
See other message regarding noinst_HEADERS. Yes, CashUtil, PilotQOF and all
the others will only depend on the results of make install. That is indeed
how they are currently built and tested.
Yea, crossed in the mail. We're on the same page now. :)

-derek
--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
***@MIT.EDU PGP key available
Neil Williams
2005-07-20 18:06:13 UTC
Post by David Hampton
O.K. I'm confused. I thought the point of switching to SQL was that the
data was always written through to the database, so that if gnucash
crashed at any point there would be no lost work.
The SQL backend has a cache capability - with remote backends, e.g. network or
SQL, this would load only enough data to make the book actually usable; it
would not cause *all* of the data to be loaded.

last_update is used for comparing the instance version in local memory to that
in the remote server.

It's the wrong flag to use for what you want.
Post by David Hampton
Post by Neil Williams
The alternative method would involve keeping tabs on all instances and
sorting the various update times but then that structure would need to be
stored outside the library anyway (couldn't be a static).
How hard is it to keep track of the first time this session you issued
an UPDATE or DELETE command?
QOF does not support either UPDATE or DELETE (yet). It's only recently that
I've added INSERT. All editing is done via calls direct to the object - you
edit an Account and the dialog calls a function in Account.c not qofquery.c -
there is no central tally of dynamic data, only the static object
definitions.

When I use param_setfcn, it is up to the relevant function in the *object* to
set any dirty flags or do any other work it needs to do. QOF merely passes
the correct data to the correct function of that instance. It neither cares
about nor understands the object itself.
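A rough sketch of the dispatch described above (all names invented for illustration; `qof_set_param_sketch` is not the real QOF call): QOF resolves a parameter's setter and calls into the object, and the object's own setter is the one that marks itself dirty.

```c
#include <stdio.h>
#include <stdbool.h>

/* Illustrative setter callback type, in the spirit of param_setfcn. */
typedef void (*ParamSetFcn)(void *obj, void *value);

typedef struct {
    char name[64];
    bool dirty;
} Account;

/* The object's setter does the work, including the dirty flag. */
static void account_set_name(void *obj, void *value)
{
    Account *acc = obj;
    snprintf(acc->name, sizeof acc->name, "%s", (const char *)value);
    acc->dirty = true;   /* the object, not QOF, sets its dirty flag */
}

/* QOF-side dispatch: passes data through without knowing the type. */
static void qof_set_param_sketch(void *obj, ParamSetFcn setter, void *value)
{
    setter(obj, value);
}
```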

Also the session has no direct knowledge of changes within the book - it's
just the connection between the book and the filesystem. QofSession doesn't
particularly care if the data is dirty, it'll save it anyway.

Editing an Account sets that instance as dirty. The book doesn't know anything
about saving. It's just that whenever data is modified, the 'dirty' flag is
usually set. This is detected using qof_object_is_dirty which iterates
through each class and each collection for that class. i.e. it is
retrospective. QOF knows nothing about changes until it is asked. The reason
it's so fast currently is that most QOF objects don't *have* an 'is_dirty'
prototype in their object definitions and hence aren't checked to see if they
are dirty (unless the collection is marked as dirty in the Scheme code
somewhere).

As Derek pointed out, the dirty check is by class, not by instance and not
every instance can either set or detect the dirty flag of the collection.
(Unless this is handled in the Scheme).

This is another hurdle I need to clear for the CLI. I'll be looking at tying
this together so that maybe objects no longer call obj->inst.dirty = TRUE
directly but instead call qof_instance_make_dirty((QofInstance*) obj) which
in turn can check that the QofCollection is also marked as dirty. If I get
time to do that, then I could look at setting this as a time and maybe
holding a reference value (to see if this one should update the reference) in
QofBook. I don't know if I'll be able to do that before G2.

To find the oldest change since the last save currently requires iterating
through every *instance*, not just the classes or collections.
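The difference in cost can be sketched as follows (illustrative structures, not QOF's actual types): the save-needed check touches one flag per class-level collection and bails early, while finding the oldest unsaved change must visit every instance of every class.

```c
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Illustrative only: one dirty flag per class-level collection, plus
 * per-instance change times (0 == clean). */
typedef struct {
    bool dirty;
    time_t *change_times;
    size_t n_instances;
} Collection;

/* Cheap check: at most one visit per class. */
static bool book_is_dirty(const Collection *colls, size_t n_classes)
{
    for (size_t i = 0; i < n_classes; i++)
        if (colls[i].dirty)
            return true;   /* bail at the first dirty class */
    return false;
}

/* Expensive check: must walk every instance of every class. */
static time_t oldest_unsaved_change(const Collection *colls, size_t n_classes)
{
    time_t oldest = 0;
    for (size_t i = 0; i < n_classes; i++)
        for (size_t j = 0; j < colls[i].n_instances; j++) {
            time_t t = colls[i].change_times[j];
            if (t != 0 && (oldest == 0 || t < oldest))
                oldest = t;
        }
    return oldest;
}
```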
Post by David Hampton
Post by Neil Williams
"The secondary text provides the user with some context about the number
of changes that might be unsaved."
With a nice big picture showing time in minutes.
We can do that - time since the last save. What might not be worth the effort
is time since the last change.
Post by David Hampton
I don't hear an alternative context proposal here. What sort of context
do you propose should be provided to users in place of a time?
Time since last save, not time since last change.
Post by David Hampton
QOF already keeps dirty flags for every item it knows about.
Not quite. Each instance *can* keep its dirty flag but many do not choose to
do so. Those that do, do not always set the dirty flag on their own
collection and most do not set a function prototype that can detect changes
via the book/object interface.
Post by David Hampton
How
hard would it be to change the code that marks objects as modified from
In each case, it's the instance that is marked as dirty - not the object
(which would make all instances of that object dirty).
Post by David Hampton
dirty = TRUE
if (dirty == 0)
dirty = time();
That's not the hard part. The hard part is then collating those timestamps and
sorting them. Currently, all that matters is that at least *one* instance has
set a dirty flag - it matters not when or who. What you want is a
comprehensive dirty flag that catches edits that may not currently be setting
any dirty flags.

How much of this is currently done in Scheme?

Why are the 'is_dirty' prototypes NULL in almost every object?

Are there situations where dirty instances slip through the net because the
collection is not also set as dirty? (If yes, I'll fix it.)
Post by David Hampton
When you try and close the main gnucash window, QOF already iterates
over the entire set of objects. It must, in order to determine if a
save dialog is needed.
See above. Iterating over the objects (i.e. the collections) is NOT the same
as iterating over every single instance. That is left to the backend during
the actual save. Iterating over every single Split just to find one that's
changed is not part of the dirty check.
Post by David Hampton
The best case today is that the first object is
modified and the iteration can bail. The worst case today is that every
object has to be inspected to discover that it has not been modified.
Objects are far less numerous than instances.
Post by David Hampton
and then a possible time comparison. Yes, the entire object tree would
have to be run each time, but is that really a big deal?
Iterating over the entire instance 'tree' would be a big deal, yes.
Post by David Hampton
Agreed, but who cares about the time period since the last save? I
couldn't care less that I haven't saved in 25 days if there aren't any
changes to the file. If there are changes to the file, I would like to
know that the change was 42 minutes ago, not wonder what I changed in
the last 25 days.
It depends how it works out with the CLI code.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Chris Shoemaker
2005-07-20 19:16:42 UTC
Post by David Hampton
Post by Neil Williams
+ /*
+ * *** THIS DIALOG IS NOT HIG COMPLIANT. ***
+ *
+ * According to the HIG, the secondary context should include
+ * context about the number of changes that will be lost (either in
+ * time or a count). While it is possible to simply provide the
+ * time since the last save, that doesn't appear too useful. If
+ * the user has had Gnucash open for hours in the background, but
+ * only made a change in the last few minutes, then telling them
+ * they will lose hours' worth of work is wrong. The QOF code needs
+ * to be modified to provide better timing information. The best
+ * case scenario would be if QOF could provide a timestamp of the
+ * oldest unsaved change.
+ */
The SQL backend uses qof_instance_get_last_update but this dialog would
require iterating over every instance in the book, one type at a time prior
to displaying the dialog, then sorting the timespecs (and this when the user
is waiting for the app to close!)
O.K. I'm confused. I thought the point of switching to SQL was that the
data was always written through to the database, so that if gnucash
crashed at any point there would be no lost work.
Post by Neil Williams
The alternative method would involve keeping tabs on all instances and sorting
the various update times but then that structure would need to be stored
outside the library anyway (couldn't be a static).
How hard is it to keep track of the first time this session you issued
an UPDATE or DELETE command? I don't need to know what it was, only
when it was. Isn't the SQL code contained in a small set of files?
Post by Neil Williams
Do other applications interpret "Time Period" in the above manner?
I don't know. I would love to see "... your 5 changes from the past 1
hour and 30 minutes will be discarded" but I figured that was asking for
too much.
Post by Neil Williams
"The secondary text provides the user with some context about the number of
changes that might be unsaved."
With a nice big picture showing time in minutes.
Post by Neil Williams
IMHO, "some context" does not mean "number of seconds since the last change,
anywhere in a 2Mb book". The iterations themselves could delay the display of
the dialog. I don't see how this is a job for the QOF library.
I don't hear an alternative context proposal here. What sort of context
do you propose should be provided to users in place of a time?
QOF already keeps dirty flags for every item it knows about. How
hard would it be to change the code that marks objects as modified from
dirty = TRUE
if (dirty == 0)
dirty = time();
When you try and close the main gnucash window, QOF already iterates
over the entire set of objects. It must, in order to determine if a
save dialog is needed. The best case today is that the first object is
modified and the iteration can bail. The worst case today is that every
object has to be inspected to discover that it has not been modified.
This is true today of every extant version of GnuCash. The dialog is
different, but the code path to see if the dialog is needed is the same.
In order to provide a time in the context dialog, a boolean test would
be replaced with a test for a non-zero time (still a single instruction)
and then a possible time comparison. Yes, the entire object tree would
have to be run each time, but is that really a big deal? Have you
noticed a long delay when quitting gnucash when you haven't changed
anything? The additional time comparison code would only be executed for
each object that is modified.
Post by Neil Williams
How many people really do leave GnuCash running in the background?
I have on occasion.
So have I.
Post by David Hampton
Post by Neil Williams
http://developer.gnome.org/projects/gup/hig/1.0/windows.html#alerts-confirmation
The example itself only shows a simple time period, no hint that this has to
be the time period since the last modification or the last save.
Agreed, but who cares about the time period since the last save? I
couldn't care less that I haven't saved in 25 days if there aren't any
changes to the file. If there are changes to the file, I would like to
know that the change was 42 minutes ago, not wonder what I changed in
the last 25 days.
I'm really all for the HIG and everything but I think this question of
what to display is moot. Right now, I'd be happy if it only asked me
to save if I really changed something and closed without asking when I
didn't. Providing *more* context ("N
minutes/bytes/transactions/whatever") just makes things *worse* when
the test for dirtiness was a false positive anyway.

We're already basically lying to the user about what he's done to his
data, so now we want to include lots of specific details to make our
lie more credible? :) "No... REALLY... you made changes to your
Checking Account 8 minutes and 32 seconds ago when you opened the
register to look at recent activity. Do you want to save the changes
or not?!"

Correctness before compliance.

I know this bug is not going to get fixed anytime soon, but let's not
draw any more attention to the fact that we think the user has changed
their data when they haven't.

-chris
Post by David Hampton
David
_______________________________________________
gnucash-patches mailing list
https://lists.gnucash.org/mailman/listinfo/gnucash-patches
David Hampton
2005-07-20 19:46:49 UTC
Post by Chris Shoemaker
We're already basically lying to the user about what he's done to his
data,
Other than the bug that opening a register marks the book as changed, do
you have other examples?

David
Chris Shoemaker
2005-07-20 19:46:32 UTC
Post by David Hampton
Post by Chris Shoemaker
We're already basically lying to the user about what he's done to his
data,
Other than the bug that opening a register marks the book as changed, do
you have other examples?
Nah. But opening a register is something I do pretty much every time
I open gc, so I'd bet this is behavior almost every user sees. Oh
well.

-chris
Post by David Hampton
David
Derek Atkins
2005-07-20 14:59:20 UTC
Permalink
Post by Neil Williams
+ /*
+ * *** THIS DIALOG IS NOT HIG COMPLIANT. ***
+ *
+ * According to the HIG, the secondary context should include
+ * context about the number of changes that will be lost (either in
+ * time or a count). While it is possible to simply provide the
+ * time since the last save, that doesn't appear too useful. If
+ * the user has had Gnucash open for hours in the background, but
+ * only made a change in the last few minutes, then telling them
+ * they will lose hours' worth of work is wrong. The QOF code needs
+ * to be modified to provide better timing information. The best
+ * case scenario would be if QOF could provide a timestamp of the
+ * oldest unsaved change.
+ */
The SQL backend uses qof_instance_get_last_update but this dialog would
require iterating over every instance in the book, one type at a time prior
to displaying the dialog, then sorting the timespecs (and this when the user
is waiting for the app to close!)
The alternative method would involve keeping tabs on all instances and sorting
the various update times but then that structure would need to be stored
outside the library anyway (couldn't be a static).
Do other applications interpret "Time Period" in the above manner?
I think that iterating through all the instances in the book is way
too much overhead. Iterating through each class would be okay, as there
are a limited number of classes.

However, I think that just saying something like "You last saved your
data at [timestamp]. If you quit now without saving then you will
lose all changes you've made in [delta]." Obviously this is only
required if changes have been made. This requires fixing the register
code so that opening the register does not automagically dirty the
book.
Post by Neil Williams
"The secondary text provides the user with some context about the number of
changes that might be unsaved."
IMHO, "some context" does not mean "number of seconds since the last change,
anywhere in a 2Mb book". The iterations themselves could delay the display of
the dialog. I don't see how this is a job for the QOF library.
How many people really do leave GnuCash running in the background?
Lots! Don't make assumptions about how people use the app; you'll
always be wrong. Whenever you find yourself thinking "users won't do
that" you'll undoubtedly get hit with tons of bug reports when users
"do that". :-/
Post by Neil Williams
http://developer.gnome.org/projects/gup/hig/1.0/windows.html#alerts-confirmation
The example itself only shows a simple time period, no hint that this has to
be the time period since the last modification or the last save.
-derek
--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
***@MIT.EDU PGP key available
Dan Widyono
2005-07-20 15:57:24 UTC
Permalink
Post by Neil Williams
How many people really do leave GnuCash running in the background?
Lots!
Yeah. I do. :)

Dan W.
Andrew Sackville-West
2005-07-20 15:55:46 UTC
Permalink
<< munch>>
Post by Neil Williams
How many people really do leave GnuCash running in the background?
Lots! Don't make assumptions about how people use the app; you'll
always be wrong. Whenever you find yourself thinking "users won't do
that" you'll undoubtedly get hit with tons of bug reports when users
"do that". :-/
<<snip>>

FWIW, mine is up all the time. Literally. I am in the habit of saving
constantly... every time I get back to the main accounts tab, I now
automatically click on save, as data loss for me can be catastrophic (I
have over 1400 transactions and 200+ accounts so far this year) what
with payrolls and all that jazz. Sorry, rambling. Mine is always
running. I go lurk now.

A
-derek
Chris Shoemaker
2005-07-20 19:03:33 UTC
Permalink
Post by Derek Atkins
Post by Neil Williams
+ /*
+ * *** THIS DIALOG IS NOT HIG COMPLIANT. ***
+ *
+ * According to the HIG, the secondary context should include
+ * context about the number of changes that will be lost (either in
+ * time or a count). While it is possible to simply provide the
+ * time since the last save, that doesn't appear too useful. If
+ * the user has had Gnucash open for hours in the background, but
+ * only made a change in the last few minutes, then telling them
+ * they will lose hours' worth of work is wrong. The QOF code needs
+ * to be modified to provide better timing information. The best
+ * case scenario would be if QOF could provide a timestamp of the
+ * oldest unsaved change.
+ */
The SQL backend uses qof_instance_get_last_update but this dialog would
require iterating over every instance in the book, one type at a time prior
to displaying the dialog, then sorting the timespecs (and this when the user
is waiting for the app to close!)
The alternative method would involve keeping tabs on all instances and sorting
the various update times but then that structure would need to be stored
outside the library anyway (couldn't be a static).
Do other applications interpret "Time Period" in the above manner?
I think that iterating through all the instances in the book is way
too much overhead. Iterating through each class would be okay, as there
are a limited number of classes.
However, I think that just saying something like "You last saved your
data at [timestamp]. If you quit now without saving then you will
lose all changes you've made in [delta]." Obviously this is only
required if changes have been made. This requires fixing the register
code so that opening the register does not automagically dirty the
book.
I've looked at this pretty closely. This is not easy to change.
Opening the register creates many cascades of events which dirty lots
of objects. In particular, this is due to the handling of the blank
split/trans. Showing a blank split/trans actually involves modifying
the financial objects. Avoiding this is one of the chief design goals
of my register re-write.

Incidentally, is this behavior specific to g2? I've never noticed it
in 1.x, but I imagine the register code hasn't changed much.

-chris
Post by Derek Atkins
Post by Neil Williams
"The secondary text provides the user with some context about the number of
changes that might be unsaved."
IMHO, "some context" does not mean "number of seconds since the last change,
anywhere in a 2Mb book". The iterations themselves could delay the display of
the dialog. I don't see how this is a job for the QOF library.
How many people really do leave GnuCash running in the background?
Lots! Don't make assumptions about how people use the app; you'll
always be wrong. Whenever you find yourself thinking "users won't do
that" you'll undoubtedly get hit with tons of bug reports when users
"do that". :-/
Post by Neil Williams
http://developer.gnome.org/projects/gup/hig/1.0/windows.html#alerts-confirmation
The example itself only shows a simple time period, no hint that this has to
be the time period since the last modification or the last save.
-derek
--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
_______________________________________________
gnucash-devel mailing list
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
David Hampton
2005-07-20 19:49:03 UTC
Permalink
Post by Chris Shoemaker
Incidentally, is this behavior specific to g2? I've never noticed it
in 1.x, but I imagine the register code hasn't changed much.
It was added in HEAD and pulled into g2. Derek thinks the blank split
code should be rewritten. I'm for reverting the change to the
transaction scrubbing code that created the problem.

David
Chris Shoemaker
2005-07-20 19:55:44 UTC
Permalink
Post by David Hampton
Post by Chris Shoemaker
Incidentally, is this behavior specific to g2? I've never noticed it
in 1.x, but I imagine the register code hasn't changed much.
It was added in HEAD and pulled into g2. Derek thinks the blank split
code should be rewritten. I'm for reverting the change to the
transaction scrubbing code that created the problem.
Nice to know I'm not imagining things. The blank split code *is*
particularly bad, but I think the effort to fix/rewrite it in the
current register is probably 50% of the effort to rewrite the whole
register.

I looked pretty deeply into the register code a couple months ago
and I don't remember seeing any relationship between transaction
scrubbing code and the transaction dirtying. Could you point me
toward what you mean?

-chris
Post by David Hampton
David
Derek Atkins
2005-07-20 20:09:38 UTC
Permalink
Post by Chris Shoemaker
Post by David Hampton
Post by Chris Shoemaker
Incidentally, is this behavior specific to g2? I've never noticed it
in 1.x, but I imagine the register code hasn't changed much.
It was added in HEAD and pulled into g2. Derek thinks the blank split
code should be rewritten. I'm for reverting the change to the
transaction scrubbing code that created the problem.
Nice to know I'm not imagining things. The blank split code *is*
particularly bad, but I think the effort to fix/rewrite it in the
current register is probably 50% of the effort to rewrite the whole
register.
I looked pretty deeply into the register code a couple months ago
and I don't remember seeing any relationship between transaction
scrubbing code and the transaction dirtying. Could you point me
toward what you mean?
xaccTransCommitEdit() calls xaccScrubTrans() which calls a bunch of other
functions that add the extra splits and dirty the books.

The problem is that the register creates the blank transaction and blank split
and then calls xaccTransBeginEdit() and xaccTransCommitEdit() immediately,
before the user has created any data in the split. This causes the additional
split(s) to get created and committed, too, thereby dirtying the book.

What NEEDS to happen is that the register should call xaccTransBeginEdit() when
it creates the blank split/trans but _NOT_ call xaccTransCommitEdit() until
after the user actually enters data into the thing.

I do NOT believe this is 50% of a register rewrite. Far from it; I think it's
maybe a day or two of work for someone who doesn't know the register to move
that CommitEdit() and make sure the Begin's and Commit's are properly balanced
in all cases.

I am fairly confident that this will solve the problem.
Post by Chris Shoemaker
-chris
Post by David Hampton
David
-derek
--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
***@MIT.EDU PGP key available
Chris Shoemaker
2005-07-20 20:57:34 UTC
Permalink
Post by Derek Atkins
Post by Chris Shoemaker
Post by David Hampton
Post by Chris Shoemaker
Incidentally, is this behavior specific to g2? I've never noticed it
in 1.x, but I imagine the register code hasn't changed much.
It was added in HEAD and pulled into g2. Derek thinks the blank split
code should be rewritten. I'm for reverting the change to the
transaction scrubbing code that created the problem.
Nice to know I'm not imagining things. The blank split code *is*
particularly bad, but I think the effort to fix/rewrite it in the
current register is probably 50% of the effort to rewrite the whole
register.
I looked pretty deeply into the register code a couple months ago
and I don't remember seeing any relationship between transaction
scrubbing code and the transaction dirtying. Could you point me
toward what you mean?
xaccTransCommitEdit() calls xaccScrubTrans() which calls a bunch of other
functions that add the extra splits and dirty the books.
The problem is that the register creates the blank transaction and blank split
and then calls xaccTransBeginEdit() and xaccTransCommitEdit() immediately,
before the user has created any data in the split. This causes the additional
split(s) to get created and committed, too, thereby dirtying the book.
What NEEDS to happen is that the register should call xaccTransBeginEdit() when
it creates the blank split/trans but _NOT_ call xaccTransCommitEdit() until
after the user actually enters data into the thing.
I disagree. Moving the cursor through the register (which creates a
blank split on each Transaction I move through) should not call
xaccTransBeginEdit (and presumably xaccTransRollback) for each
Transaction. xaccTransBeginEdit should only be called once the user
starts to change data.
Post by Derek Atkins
I do NOT believe this is 50% of a register rewrite. Far from it; I think it's
maybe a day or two of work for someone who doesn't know the register to move
that CommitEdit() and make sure the Begin's and Commit's are properly balanced
in all cases.
I am fairly confident that this will solve the problem.
If the Rollback doesn't dirty anything, then you may be right, but I
don't think this is the best solution. However, it may be better than
what's there.

-chris
Post by Derek Atkins
Post by Chris Shoemaker
-chris
Post by David Hampton
David
-derek
--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
Derek Atkins
2005-07-20 21:19:46 UTC
Permalink
Post by David Hampton
Post by Derek Atkins
Post by Chris Shoemaker
Post by David Hampton
Post by Chris Shoemaker
Incidentally, is this behavior specific to g2? I've never noticed it
in 1.x, but I imagine the register code hasn't changed much.
It was added in HEAD and pulled into g2. Derek thinks the blank split
code should be rewritten. I'm for reverting the change to the
transaction scrubbing code that created the problem.
Nice to know I'm not imagining things. The blank split code *is*
particularly bad, but I think the effort to fix/rewrite it in the
current register is probably 50% of the effort to rewrite the whole
register.
I looked pretty deeply into the register code a couple months ago
and I don't remember seeing any relatioship between transaction
scrubbing code and the transaction dirtying. Could you point me
toward what you mean?
xaccTransCommitEdit() calls xaccScrubTrans() which calls a bunch of other
functions that add the extra splits and dirty the books.
The problem is that the register creates the blank transaction and blank split
and then calls xaccTransBeginEdit() and xaccTransCommitEdit() immediately,
before the user has created any data in the split. This causes the additional
split(s) to get created and committed, too, thereby dirtying the book.
What NEEDS to happen is that the register should call xaccTransBeginEdit() when
it creates the blank split/trans but _NOT_ call xaccTransCommitEdit() until
after the user actually enters data into the thing.
I disagree. Moving the cursor through the register (which creates a
blank split on each Transaction I move through) should not call
xaccTransBeginEdit (and presumably xaccTransRollback) for each
Transaction. xaccTransBeginEdit should only be called once the user
starts to change data.
Unfortunately the act of assigning the split to the transaction/account will
cause it to call Begin/Commit Edit.
Post by David Hampton
Post by Derek Atkins
I do NOT believe this is 50% of a register rewrite. Far from it; I think it's
maybe a day or two of work for someone who doesn't know the register to move
that CommitEdit() and make sure the Begin's and Commit's are properly balanced
in all cases.
I am fairly confident that this will solve the problem.
If the Rollback doesn't dirty anything, then you may be right, but I
don't think this is the best solution. However, it may be better than
what's there.
You don't need to call Rollback. You just call xaccTransDelete() and then
xaccTransCommitEdit() to delete it. If you haven't made any changes then it
shouldn't dirty anything.
Post by David Hampton
-chris
-derek
--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
***@MIT.EDU PGP key available
David Hampton
2005-07-20 16:36:33 UTC
Permalink
Post by Neil Williams
"The secondary text provides the user with some context about the number of
changes that might be unsaved."
BTW, The HIG also suggests adding a '*' to the window title if there are
unsaved changes in the file.

http://developer.gnome.org/projects/gup/hig/2.0/windows-primary.html#primary-window-titles

Perhaps change notifications should be propagated to a central location.
That way the code can test a single flag instead of scanning all the
object classes whenever you want to determine if the data file is clean
or dirty. Makes the test before presenting the save dialog trivial.
Sadly it probably wouldn't allow counting of the number of changes, as
one new transaction would likely generate three or more notifications
(one for the transaction itself and one for each split.)

David
Chris Shoemaker
2005-07-20 18:53:53 UTC
Permalink
Post by David Hampton
Post by Neil Williams
"The secondary text provides the user with some context about the number of
changes that might be unsaved."
BTW, The HIG also suggests adding a '*' to the window title if there are
unsaved changes in the file.
http://developer.gnome.org/projects/gup/hig/2.0/windows-primary.html#primary-window-titles
Perhaps change notifications should be propagated to a central location.
That way the code can test a single flag instead of scanning all the
object classes whenever you want to determine if the data file is clean
or dirty. Makes the test before presenting the save dialog trivial.
Sadly it probably wouldn't allow counting of the number of changes, as
one new transaction would likely generate three or more notifications
(one for the transaction itself and one for each split.)
That reminds me of a question I've had. ISTM, there's some vision of
"dirtiness" propagating from Instance to Collection. However, I think
it would make sense if dirtiness propagated up the containment
hierarchy. E.g. User dirties a Split, which dirties the split's
Transaction, and Account, which dirties the account's Group, which
dirties the Book. Now, a check for the need to save can just check
the book. And, the code that commits changes and clears the dirty
flag could also traverse the tree instead of committing whole
Collections, or searching in a Collection for the single split that was
dirtied.

Incidentally, why does Split not even have a dirty flag? What design
decision governs whether a core engine object should derive from
Instance or Entity?

-chris
Post by David Hampton
David
_______________________________________________
gnucash-patches mailing list
https://lists.gnucash.org/mailman/listinfo/gnucash-patches
Neil Williams
2005-07-20 19:19:40 UTC
Permalink
Post by Chris Shoemaker
That reminds me of a question I've had. ISTM, there's some vision of
"dirtiness" propagating from Instance to Collection.
There is now, yes.
Post by Chris Shoemaker
However, I think
it would make sense if dirtiness propagated up the containment
hierarchy. E.g. User dirties a Split, which dirties the split's
Transaction, and Account, which dirties the account's Group, which
dirties the Book.
Why such a tortuous path? Split -> Collection -> Book. Checking the book
automatically checks all collections. The Book won't know WHICH split has
been changed so you gain nothing.

With a backend that only stored dirty instances (e.g. by using a local cache -
SQL), then marking the Trans, Account and Group as dirty is
counter-productive. Those haven't changed, only the Split has changed - it
could make a big difference if the backend is actually a network connection.

These backends could identify which collections are dirty in the book - then
identify which instances are dirty by *only* having to iterate over those
collections, instead of all collections and thereby all instances. In your
example, the backend would only have to iterate over the Splits, not the
entire book. If you make Group dirty every time a Split is changed, you lose
all ability to identify *which* instances are actually dirty.
Post by Chris Shoemaker
Now, a check for the need to save can just check
the book.
With the new qof_instance_set_dirty() that will be done in one call.
Post by Chris Shoemaker
And, the code that commits changes and clears the dirty
flag could also traverse the tree instead of committing whole
Collections, or searching in a Collection for the single split that was
dirtied.
Ah now that would be slower. No point, as I see it, traversing the entire tree
- set the collection to clean and all done.

The QofCollection dirty marker is a single boolean for the entire collection,
it does not require iterating through the collection itself, either to set or
to clear.
Post by Chris Shoemaker
Incidentally, why does Split not even have a dirty flag?
I'll be changing that - every object will use qof_collection_is_dirty and
qof_collection_make_clean in their object definitions.
Post by Chris Shoemaker
What design
decision governs whether a core engine object should derive from
Instance or Entity?
?? All core structs contain a QofInstance which itself contains a QofEntity.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Chris Shoemaker
2005-07-20 19:35:20 UTC
Permalink
Post by Neil Williams
Post by Chris Shoemaker
That reminds me of a question I've had. ISTM, there's some vision of
"dirtiness" propagating from Instance to Collection.
There is now, yes.
Post by Chris Shoemaker
However, I think
it would make sense if dirtiness propagated up the containment
hierarchy. E.g. User dirties a Split, which dirties the split's
Transaction, and Account, which dirties the account's Group, which
dirties the Book.
Why such a tortuous path? Split -> Collection -> Book. Checking the book
automatically checks all collections. The Book won't know WHICH split has
been changed so you gain nothing.
Ah, I didn't realize the Collection's dirtiness propagated to the Book.
Post by Neil Williams
With a backend that only stored dirty instances (e.g. by using a local cache -
SQL), then marking the Trans, Account and Group as dirty is
counter-productive. Those haven't changed, only the Split has changed - it
could make a big difference if the backend is actually a network connection.
Maybe there needs to be a distinction between "is dirty itself" and
"contains something dirty." I think this was what Derek meant when he
was talking about supporting a different event for editing an account
than for adding a transaction. One would dirty the Account, the other
would only indicate that the Account contained a dirty Transaction.
Post by Neil Williams
These backends could identify which collections are dirty in the book - then
identify which instances are dirty by *only* having to iterate over those
collections, instead of all collections and thereby all instances. In your
example, the backend would only have to iterate over the Splits, not the
entire book. If you make Group dirty every time a Split is changed, you lose
all ability to identify *which* instances are actually dirty.
Post by Chris Shoemaker
Now, a check for the need to save can just check
the book.
With the new qof_instance_set_dirty() that will be done in one call.
Post by Chris Shoemaker
And, the code that commits changes and clears the dirty
flag could also traverse the tree instead of committing whole
Collections, or searching in a Collection for the single split that was
dirtied.
Ah now that would be slower. No point, as I see it, traversing the entire tree
- set the collection to clean and all done.
Not the entire tree, only the dirty portion. Extreme example: I have
100000 Splits, and edit one of them which happens to be in an Account
with only 5 Transactions. A tree search would search 1 book and find
1 dirty*, search 75 Accounts and find 1 dirty*, search 5 Transactions
and find 1 dirty*, search 2 splits and find 1 dirty**, and then commit
one split.

(*) here, I mean "contains something dirty"
(**) here, I mean "is itself dirty"
Post by Neil Williams
The QofCollection dirty marker is a single boolean for the entire collection,
it does not require iterating through the collection itself, either to set or
to clear.
Then you either need a linear search through 100000 Splits, or just to
commit all of them because you only know that the Collection is dirty,
not that it contains 1 dirty Split.
Post by Neil Williams
Post by Chris Shoemaker
Incidentally, why does Split not even have a dirty flag?
I'll be changing that - every object will use qof_collection_is_dirty and
qof_collection_make_clean in their object definitions.
Post by Chris Shoemaker
What design
decision governs whether a core engine object should derive from
Instance or Entity?
?? All core structs contain a QofInstance which itself contains a QofEntity.
Except for Split, I guess? (I'm actually not looking at the code, so
I may be misremembering.)

-chris
Post by Neil Williams
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
_______________________________________________
gnucash-devel mailing list
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
David Hampton
2005-07-20 20:07:24 UTC
Permalink
Post by Chris Shoemaker
Post by Neil Williams
Why such a tortuous path? Split -> Collection -> Book. Checking the book
automatically checks all collections. The Book won't know WHICH split has
been changed so you gain nothing.
Ah, I didn't realize the Collection's dirtiness propagated to the Book.
Terminology check here. As far as I can tell, in the current code
dirtiness doesn't *propagate* anywhere. Checking whether the book is
dirty automatically checks whether any of the collections are dirty.
This is a "pull" system, not a "push" system. If gnucash was a push
system I'd have everything I need to fix the dialog and the window
title.

David
Chris Shoemaker
2005-07-20 20:21:54 UTC
Permalink
Post by David Hampton
Post by Chris Shoemaker
Post by Neil Williams
Why such a tortuous path? Split -> Collection -> Book. Checking the book
automatically checks all collections. The Book won't know WHICH split has
been changed so you gain nothing.
Ah, I didn't realize the Collection's dirtiness propagated to the Book.
Terminology check here. As far as I can tell, in the current code
dirtiness doesn't *propagate* anywhere. Checking whether the book is
dirty automatically checks whether any of the collections are dirty.
Ah, ok. Gotcha.
Post by David Hampton
This is a "pull" system, not a "push" system. If gnucash was a push
system I'd have everything I need to fix the dialog and the window
title.
Yes, such a model would make the register code easier, too.

-chris
Post by David Hampton
David
Neil Williams
2005-07-20 20:05:21 UTC
Permalink
Post by Chris Shoemaker
Post by Neil Williams
With a backend that only stored dirty instances (e.g. by using a local
cache - SQL), then marking the Trans, Account and Group as dirty is
counter-productive. Those haven't changed, only the Split has changed -
it could make a big difference if the backend is actually a network
connection.
Maybe there needs to be a distinction between "is dirty itself" and
"contains something dirty."
Yes, but not in the way you describe below!
:-))
Post by Chris Shoemaker
I think this was what Derek meant when he
was talking about supporting a different event for editing an account
than for adding a transaction. One would dirty the Account, the other
would only indicate that the Account contained a dirty Transaction.
I think that can be handled through the existing event handler interface,
without more confusion about what is meant by dirty.

Derek: Did you mean that the event would make a parent instance dirty (i.e.
Trans -> Account) or just that it would trigger different events in the GUI
according to whether it was a Trans or an Account?
Post by Chris Shoemaker
Post by Neil Williams
Ah now that would be slower. No point, as I see it, traversing the entire
tree - set the collection to clean and all done.
Not the entire tree, only the dirty portion. Extreme example: I have
100000 Splits, and edit one of them which happens to be in an Account
with only 5 Transactions. A tree search would search 1 book and find
1 dirty*, search 75 Accounts and find 1 dirty*, search 5 Transactions
and find 1 dirty*, search 2 splits and find 1 dirty**, and then commit
one split.
(*) here, I mean "contains something dirty"
(**) here, I mean "is itself dirty"
The SQL backend is the only one that can handle such differentiation in
storage and that can handle such things using the last_update mechanism. I
don't see the need for a second dirty mechanism.

last_update was (AFAICT) designed to cover exactly your example.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Derek Atkins
2005-07-20 20:20:15 UTC
Permalink
Post by Neil Williams
Post by Chris Shoemaker
I think this was what Derek meant when he
was talking about supporting a different event for editing an account
than for adding a transaction. One would dirty the Account, the other
would only indicate that the Account contained a dirty Transaction.
I think that can be handled through the existing event handler interface,
without more confusion about what is meant by dirty.
Derek: Did you mean that the event would make a parent instance dirty (i.e.
Trans -> Account) or just that it would trigger different events in the GUI
according to whether it was a Trans or an Account?
Neither... I think we need more types of events. Right now there is no
difference between the event when you change an Account Name and when you add a
new Transaction to the account. Right now it's all MOD_EVENT, but we probably
need a MOD_EVENT and ADDREM_EVENT to differentiate between these cases. We
might want even more, but I'm not sure offhand.

I was just saying that right now we have a "mark_dirty()" static inline function
in every object file that effectively does the same thing. We should abstract
that out into qofinstance and have a "qof_instance_mark_dirty" API that lets us
put that code into a single place so all qof objects handle changes in the same
way.

This new API could theoretically also tie into the event system so calling
qof_instance_mark_dirty() could also send an appropriate event... But then
we'd need different APIs (probably macro'd) to differentiate between the event
types.
Post by Neil Williams
Post by Chris Shoemaker
Post by Neil Williams
Ah now that would be slower. No point, as I see it, traversing the entire
tree - set the collection to clean and all done.
Not the entire tree, only the dirty portion. Extreme example: I have
100000 Splits, and edit one of them which happens to be in an Account
with only 5 Transactions. A tree search would search 1 book and find
1 dirty*, search 75 Accounts and find 1 dirty*, search 5 Transactions
and find 1 dirty*, search 2 splits and find 1 dirty**, and then commit
one split.
(*) here, I mean "contains something dirty"
(**) here, I mean "is itself dirty"
Eh? You're very confused, Neil... At least about how GnuCash does it. Your QOF
stuff might behave differently, but there is no "tree". There are lists of
objects with relations. There is no knowledge about Accounts. And what about
objects that don't fit into accounts?

Also, this is only an issue with XML, and XML is always re-written from scratch.
There are no partial writes to an XML backend. So again you don't need a
partial search. Note that with SQL we don't have any of these problems,
because the data is saved when you actually commit the change, in real-time.
So the book is never dirty.

No, you need to do a full object search to determine if something changed..
Although we COULD limit this to each base object class, which is how it is
currently implemented.
Post by Neil Williams
The SQL backend is the only one that can handle such differentiation in
storage and that can handle such things using the last_update mechanism. I
don't see the need for a second dirty mechanism.
last_update was (AFAICT) designed to cover exactly your example.
No, it wasn't. It was designed to make sure your cache of the data is as fresh
as the data in the SQL backend, so when you have multiple users you can detect
when someone else changes the data out from under you. It was not designed to
compute what you've changed in your local cache in order to tell you it.
Although I suppose it COULD be used in that fashion, that was not designed to
do that.

-derek
--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
***@MIT.EDU PGP key available
Chris Shoemaker
2005-07-20 20:46:35 UTC
Permalink
Post by Derek Atkins
Post by Neil Williams
Post by Chris Shoemaker
I think this was what Derek meant when he
was talking about supporting a different event for editing an account
than for adding a transaction. One would dirty the Account, the other
would only indicate that the Account contained a dirty Transaction.
I think that can be handled through the existing event handler interface,
without more confusion about what is meant by dirty.
Derek: Did you mean that the event would make a parent instance dirty (i.e.
Trans -> Account) or just that it would trigger different events in the GUI
according to whether it was a Trans or an Account?
Neither... I think we need more types of events. Right now there is no
difference between the event when you change an Account Name and when you add a
new Transaction to the account. Right now it's all MOD_EVENT, but we probably
need a MOD_EVENT and ADDREM_EVENT to differentiate between these cases. We
might want even more, but I'm not sure offhand.
I was just saying that right now we have a "mark_dirty()" static inline function
in every object file that effectively does the same thing. We should abstract
that out into qofinstance and have a "qof_instance_mark_dirty" API that lets us
put that code into a single place so all qof objects handle changes in the same
way.
This new API could theoretically also tie into the event system so calling
qof_instance_mark_dirty() could also send an appropriate event... But then
we'd need different APIs (probably macro'd) to differentiate between the event
types.
Hmm.. bringing these two together makes a lot of sense.
Post by Derek Atkins
Post by Neil Williams
Post by Chris Shoemaker
Post by Neil Williams
Ah now that would be slower. No point, as I see it, traversing the
entire
Post by Chris Shoemaker
Post by Neil Williams
tree - set the collection to clean and all done.
Not the entire tree, only the dirty portion. Extreme example: I have
100000 Splits, and edit one of them which happens to be in an Account
with only 5 Transactions. A tree search would search 1 book and find
1 dirty*, search 75 Accounts and find 1 dirty*, search 5 Transactions
and find 1 dirty*, search 2 splits and find 1 dirty**, and then commit
one split.
(*) here, I mean "contains something dirty"
(**) here, I mean "is itself dirty"
Eh? You're very confused, Neil... At least about how GnuCash does it. Your QOF
stuff might behave differently, but there is no "tree". There are lists of
objects with relations. There is no knowledge about Accounts.. And what about
objects that don't fit into accounts?
[ You may not have noticed that I wrote what was above, not Neil. I
wasn't trying to describe what actually happens anywhere, just
explaining that trees can be more efficient than lists if you only
need to find the minimal changed set. ]
Post by Derek Atkins
Also, this is only an issue with XML, and XML is always re-written from scratch.
There are no partial writes to an XML backend. So again you don't need a
partial search. Note that with SQL we don't have any of these problems,
because the data is saved when you actually commit the change, in real-time.
So the book is never dirty.
No, you need to do a full object search to determine if something changed..
Although we COULD limit this to each base object class, which is how it is
currently implemented.
ISTM you have the two extreme cases: XML only needs to track dirtiness
at the file (book?) level, since it has to rewrite everything anyway.
SQL doesn't need to track dirtiness at all because commits are
immediate. So what does tracking Collection-level dirtiness buy us?

ISTM, if there's some middle-ground where dirtiness *does* need to be
tracked on each instance because commits are not immediate, and yet
commits don't have to write ALL the data, then you'd want to find the
dirtiness by tree search, not by linear search for each object type.

-chris
Post by Derek Atkins
Post by Neil Williams
The SQL backend is the only one that can handle such differentiation in
storage and that can handle such things using the last_update mechanism. I
don't see the need for a second dirty mechanism.
last_update was (AFAICT) designed to cover exactly your example.
No, it wasn't. It was designed to make sure your cache of the data is as fresh
as the data in the SQL backend, so when you have multiple users you can detect
when someone else changes the data out from under you. It was not designed to
compute what you've changed in your local cache in order to tell you it.
Although I suppose it COULD be used in that fashion, that was not designed to
do that.
-derek
--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
_______________________________________________
gnucash-devel mailing list
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Neil Williams
2005-07-20 21:25:14 UTC
Permalink
Post by Chris Shoemaker
ISTM you have the two extreme cases: XML only needs to track dirtiness
at the file (book?) level
But the book looks to its collections to determine if it is dirty.
Post by Chris Shoemaker
, since it has to rewrite everything anyway.
That only considers the current GnuCash XML backend - QSF can write out and
import anything you provide.
Post by Chris Shoemaker
SQL doesn't need to track dirtiness at all because commits are
immediate.
Yes.
Post by Chris Shoemaker
So what does tracking Collection-level dirtiness buy us?
Time!

(When asking the book if it is dirty). Plus the ability to use QSF which can
export only a single collection of entities.

With so few object types relative to instances, there's no advantage in
setting the flag in the book every time an instance is edited. Just set it in
the collection and the book will check the limited number of collections when
asked.
Post by Chris Shoemaker
ISTM, if there's some middle-ground where dirtiness *does* need to be
tracked on each instance because commits are not immediate,
You'd need some form of incremental storage. Not sure what you are getting at.
Post by Chris Shoemaker
and yet
commits don't have to write ALL the data, then you'd want to find the
dirtiness by tree search, not by linear search for each object type.
BTW, what does ISTM mean? ISTR is "I Seem To Remember", I know, but ISTM?
Post by Chris Shoemaker
-chris
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Neil Williams
2005-07-20 20:55:07 UTC
Permalink
Post by Derek Atkins
I was just saying that right now we have a "mark_dirty()" static inline
function in every object file that effectively does the same thing. We
should abstract that out into qofinstance and have a
"qof_instance_mark_dirty" API that lets us put that code into a single
place so all qof objects handle changes in the same way.
That I've now done. qof_instance_set_dirty will replace all calls to
inst->dirty. Events and other calls can be added to qof_instance_set_dirty as
necessary.
Post by Derek Atkins
This new API could theoretically also tie into the event system so calling
qof_instance_mark_dirty() could also send an appropriate event... But then
we'd need different APIs (probably macro'd) to differentiate between the
event types.
Agreed.
Post by Derek Atkins
Post by David Hampton
Post by Chris Shoemaker
Post by Neil Williams
Ah now that would be slower. No point, as I see it, traversing the
entire
Post by Chris Shoemaker
Post by Neil Williams
tree - set the collection to clean and all done.
Not the entire tree, only the dirty portion. Extreme example: I have
100000 Splits, and edit one of them which happens to be in an Account
with only 5 Transactions. A tree search would search 1 book and find
1 dirty*, search 75 Accounts and find 1 dirty*, search 5 Transactions
and find 1 dirty*, search 2 splits and find 1 dirty**, and then commit
one split.
(*) here, I mean "contains something dirty"
(**) here, I mean "is itself dirty"
Eh? You're very confused, Neil... At least about how GnuCash does it.
Your QOF stuff might behave differently, but there is no "tree".
?? Wasn't it Chris that started talking of things in trees? I may have
repeated the analogy but I am not at all confused about objects and
relations / references in GnuCash or QOF. There is a hierarchy of objects,
which Chris was referring to, from Group -> Account -> Trans -> Split. That,
after all, is how the current XML backend writes things out, adding the bits
that don't fit using customised functions. QSF does not work that way, it
restricts itself to the objects and their references, exactly as you
describe. QOF, PilotQOF and CashUtil inherit exactly the same object and
reference methods.

I quite like the approach actually, it's easier to deal with objects in a book
as if the book was just a loose bag, rather than a set of rigid mailboxes
with some (like Group) more important than others. I found it quite awkward
that the current GnuCash XML ignores any Account that is not bound to a
Group.

Loose organisation via references is far easier for import/export work - it
makes partial books and partial searches possible within XML.
Post by Derek Atkins
There are
lists of objects with relations. There is no knowledge about Accounts..
And what about obects that don't fit into accounts?
Those that are QOF objects are queried as any other.
Post by Derek Atkins
Also, this is only an issue with XML, and XML is always re-written from
scratch. There are no partial writes to an XML backend.
Umm, QSF does partial writes and could quite easily be adapted to write out
only those entities that have changed since the last save. All you need is a
GList of those entities.

In fact that might not be a bad idea - if there's a problem with a backend,
the changed data could be written out as a failsafe. It could even go to
STDOUT if there's no writeable diskspace left.
Post by Derek Atkins
So again you don't
need a partial search.
But if you do need it at some other time, it can be done.
Post by Derek Atkins
Note that with SQL we don't have any of these
problems, because the data is saved when you actually commit the change, in
real-time. So the book is never dirty.
Agreed.
Post by Derek Atkins
No, you need to do a full object search to determine if something changed..
Although we COULD limit this to each base object class, which is how it is
currently implemented.
And that is how I'm working to enhance it, using qof_instance_set_dirty. As
discussed earlier, a complete iteration over all instances is overkill. No
harm in making it a check on all registered objects.
Post by Derek Atkins
No, it wasn't. It was designed to make sure your cache of the data is as
fresh as the data in the SQL backend, so when you have multiple users you
can detect when someone else changes the data out from under you.
OK, I see that.
Post by Derek Atkins
It was
not designed to compute what you've changed in your local cache in order to
tell you it. Although I suppose it COULD be used in that fashion, that was
not designed to do that.
:-)
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Chris Shoemaker
2005-07-20 20:28:48 UTC
Permalink
Post by Neil Williams
Post by Chris Shoemaker
Post by Neil Williams
With a backend that only stored dirty instances (e.g. by using a local
cache - SQL), then marking the Trans, Account and Group as dirty is
counter-productive. Those haven't changed, only the Split has changed -
it could make a big difference if the backend is actually a network
connection.
Maybe there needs to be a distiction between "is dirty itself" and
"contains something dirty."
Yes, but not in the way you describe below!
:-))
Post by Chris Shoemaker
I think this was what Derek meant when he
was talking about supporting a different event for editing an account
than for adding a transaction. One would dirty the Account, the other
would only indicate that the Account contained a dirty Transaction.
I think that can be handled through the existing event handler interface,
without more confusion about what is meant by dirty.
Derek: Did you mean that the event would make a parent instance dirty (i.e.
Trans -> Account) or just that it would trigger different events in the GUI
according to whether it was a Trans or an Account?
Post by Chris Shoemaker
Post by Neil Williams
Ah now that would be slower. No point, as I see it, traversing the entire
tree - set the collection to clean and all done.
Not the entire tree, only the dirty portion. Extreme example: I have
100000 Splits, and edit one of them which happens to be in an Account
with only 5 Transactions. A tree search would search 1 book and find
1 dirty*, search 75 Accounts and find 1 dirty*, search 5 Transactions
and find 1 dirty*, search 2 splits and find 1 dirty**, and then commit
one split.
(*) here, I mean "contains something dirty"
(**) here, I mean "is itself dirty"
The SQL backend is the only one that can handle such differentiation in
storage and that can handle such things using the last_update mechanism. I
don't see the need for a second dirty mechanism.
last_update was (AFAICT) designed to cover exactly your example.
So you're saying that somehow, I don't have to search through 100000
Splits and I don't have to commit every Split, even though I'm only
recording dirtiness on the Collection? and this is unique to the SQL
backend? Is that because all edits are committed immediately and
dirtiness isn't even relevant?

-chris
Post by Neil Williams
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
_______________________________________________
gnucash-devel mailing list
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Neil Williams
2005-07-20 21:16:32 UTC
Permalink
Post by Chris Shoemaker
So you're saying that somehow, I don't have to search through 100000
Splits
Correct. When checking if the book is dirty, the object types are checked -
one type at a time. There are fewer than two dozen object types in GnuCash.
The call is to qof_collection_is_dirty() which reports back whether its
internal gboolean is TRUE or FALSE. End. No instances are searched.

That flag will now be set via qof_instance_set_dirty so that as soon as you
edit any Split (or whatever), the QofCollection for GNC_ID_SPLIT is marked as
dirty. Each object type only has one collection per book so if any one of
those splits is changed, the collection is marked as dirty and none of the
splits need to be checked.
Post by Chris Shoemaker
and I don't have to commit every Split
The SQL will commit the changes as they happen. For the XML, currently, yes
you do have to commit them all but that may change. QSF could easily cope
with committing only those entities that have changed - all it needs is a
list. If that was desirable, the list could be created as a secondary
collection. (Secondary collections are the basis of QOF_TYPE_COLLECT and can
hold any selection of entities (>=1) that match the collection type,
irrespective of how many exist - primary collections always hold *all* such
entities that exist in the book. Unless stated otherwise, collections are
deemed primary.)
Post by Chris Shoemaker
, even though I'm only
recording dirtiness on the Collection?
You set a Split to dirty in the Split instance, that sets the flag in the
collection.

When you check to see if an instance is dirty, if the collection is clean, the
instance resets itself and reports as clean.

So the recording still happens via the instance, but it triggers a flag in
the collection, if it's not already set.
Post by Chris Shoemaker
and this is unique to the SQL
backend? Is that because all edits are committed immediately and
dirtiness isn't even relevant?
To SQL yes, but SQL also uses last_update to only sync those entities that are
in current usage.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Chris Shoemaker
2005-07-20 23:57:49 UTC
Permalink
Post by Neil Williams
Post by Chris Shoemaker
So you're saying that somehow, I don't have to search through 100000
Splits
Correct. When checking if the book is dirty, the object types are checked -
one type at a time. There are fewer than two dozen object types in GnuCash.
The call is to qof_collection_is_dirty() which reports back whether its
internal gboolean is TRUE or FALSE. End. No instances are searched.
That flag will now be set via qof_instance_set_dirty so that as soon as you
edit any Split (or whatever), the QofCollection for GNC_ID_SPLIT is marked as
dirty. Each object type only has one collection per book so if any one of
those splits is changed, the collection is marked as dirty and none of the
splits need to be checked.
Post by Chris Shoemaker
and I don't have to commit every Split
The SQL will commit the changes as they happen. For the XML, currently, yes
you do have to commit them all but that may change. QSF could easily cope
with committing only those entities that have changed - all it needs is a
list.
Ok, this was my point. I completely understand that you can get a
very quick boolean answer to the question "has anything in the book
changed?" by checking each collection's dirty flag. But think about
*how* you'd have to create a list of all dirty entities for the case
where the task is to commit just the 1 split that changed in my
example. ISTM (It seems to me) there are 3 options: 1) You can't do
that; you must commit all 100000 Splits. 2) You can do that just
fine, but you must do a linear search through 100000 Splits to find
the 1 that changed. or 3) You start at the dirty book, and perform
the tree search I described before. The time cost difference between
2) and 3) can be arbitrarily large.

I can see that QSF only needs to handle lists of uniformly typed
entities. However, if there's no way to ask "are there dirty
Transactions in this Account", then *every* selection of a subset of
Splits for committing will require a linear search through *all*
Splits. Does that seem like a problem to you?

-chris
Post by Neil Williams
If that was desirable, the list could be created as a secondary
collection. (Secondary collections are the basis of QOF_TYPE_COLLECT and can
hold any selection of entities (>=1) that match the collection type,
irrespective of how many exist - primary collections always hold *all* such
entities that exist in the book. Unless stated otherwise, collections are
deemed primary.)
Post by Chris Shoemaker
, even though I'm only
recording dirtiness on the Collection?
You set a Split to dirty in the Split instance, that sets the flag in the
collection.
When you check to see if an instance is dirty, if the collection is clean, the
instance resets itself and reports as clean.
So the recording still happens via the instance, but it triggers a flag in
the collection, if it's not already set.
Post by Chris Shoemaker
and this is unique to the SQL
backend? Is that because all edits are committed immediately and
dirtiness isn't even relevant?
To SQL yes, but SQL also uses last_update to only sync those entities that are
in current usage.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
_______________________________________________
gnucash-devel mailing list
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Neil Williams
2005-07-21 19:13:22 UTC
Permalink
Post by Chris Shoemaker
Ok, this was my point. I completely understand that you can get a
very quick boolean answer to the question "has anything in the book
changed?" by checking each collection's dirty flag. But think about
*how* you'd have to create a list of all dirty entities for the case
where the task is to commit just the 1 split that changed in my
example.
I had been - and it could be solved but I'd have to formalise the idea first.
I'm not sure what is the real-world use for such an API. I can see it as a
fallback for a failed write but that isn't particularly common. I can see it
for incremental storage systems but we don't use those yet. (SQL aside as
that can do this via a separate mechanism).
Post by Chris Shoemaker
ISTM (It seems to me) there are 3 options: 1) You can't do
that; you must commit all 100000 Splits. 2) You can do that just
fine, but you must do a linear search through 100000 Splits to find
the 1 that changed. or 3) You start at the dirty book, and perform
the tree search I described before.
Derek's point stands: The book knows nothing about the tree. There is no tree
within the book, it only exists in *our* conceptualisation of the
relationships between objects. All the book knows about are collections and
collections are not linked to each other - only objects link to other
objects.

Now it *could* be possible for the collection to keep a GList of changed
entities in its own collection. The question is, is it worth doing?

Keep in mind that all existing mechanisms are retrospective - not much is done
until the question is asked. Storing a GList of modified entities would have
to be predictive: whether you need it or not, the work would be done. This
isn't just storing a single boolean value that covers tens of thousands of
entities, the GList would store each modified entity and could get incredibly
long in some cases. It may only be storing a pointer to the entity or maybe
its GUID (as the type is already determined by the collection), but that
will mount up. It is conceivable with the SQL query dialog that I've got
planned for after G2, that the user could update every single instance of one
object type in one operation.
Post by Chris Shoemaker
The time cost difference between
2) and 3) can be arbitrarily large.
I think it would be too large to inflict on all users at all times for the odd
occasion that it might be useful.
Post by Chris Shoemaker
I can see that QSF only needs to handle lists of uniformly typed
entities. However, if there's no way to ask "are there dirty
Transactions in this Account"
The Account is marked dirty but the entities responsible are not currently
identifiable.
Post by Chris Shoemaker
, then *every* selection of a subset of
Splits for committing will require a linear search through *all*
Splits. Does that seem like a problem to you?
It would if I could see a need to identify only these entities.

Currently, I can only see this as a solution in search of a problem.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Chris Shoemaker
2005-07-21 21:04:28 UTC
Permalink
Post by Neil Williams
Post by Chris Shoemaker
Ok, this was my point. I completely understand that you can get a
very quick boolean answer to the question "has anything in the book
changed?" by checking each collection's dirty flag. But think about
*how* you'd have to create a list of all dirty entities for the case
where the task is to commit just the 1 split that changed in my
example.
I had been - and it could be solved but I'd have to formalise the idea first.
I'm not sure what is the real-world use for such an API. I can see it as a
fallback for a failed write but that isn't particularly common. I can see it
for incremental storage systems but we don't use those yet. (SQL aside as
that can do this via a separate mechanism).
If by "incremental storage system" you mean something that commits
only what has changed, then we're on the same page. (Incidentally,
even "immediate-commit" systems sometimes fallback to "delayed-commit"
systems when they're in "offline" mode.)
Post by Neil Williams
Post by Chris Shoemaker
ISTM (It seems to me) there are 3 options: 1) You can't do
that; you must commit all 100000 Splits. 2) You can do that just
fine, but you must do a linear search through 100000 Splits to find
the 1 that changed. or 3) You start at the dirty book, and perform
the tree search I described before.
Derek's point stands: The book knows nothing about the tree. There is no tree
within the book, it only exists in *our* conceptualisation of the
relationships between objects. All the book knows about are collections and
collections are not linked to each other - only objects link to other
objects.
Now it *could* be possible for the collection to keep a GList of changed
entities in its own collection. The question is, is it worth doing?
I think not. But this isn't what is required to be able to perform
incremental updates without the linear searches. As you say, it's
predictive, not retrospective.
Post by Neil Williams
Keep in mind that all existing mechanisms are retrospective - not much is done
until the question is asked. Storing a GList of modified entities would have
to be predictive: whether you need it or not, the work would be done. This
isn't just storing a single boolean value that covers tens of thousands of
entities, the GList would store each modified entity and could get incredibly
long in some cases. It may only be storing a pointer to the entity or maybe
its GUID (as the type is already determined by the collection), but that
will mount up. It is conceivable with the SQL query dialog that I've got
planned for after G2, that the user could update every single instance of one
object type in one operation.
Post by Chris Shoemaker
The time cost difference between
2) and 3) can be arbitrarily large.
I think it would be too large to inflict on all users at all times for the odd
occasion that it might be useful.
I think you may misunderstand. Both the linear search and the tree
search are retrospective, and the cost of the linear search for dirty
instances of all types will *always* be equal to or greater than the
tree search, and usually (in the cases where not everything is dirty)
it will be MUCH greater.

Proof: To find all the dirty instances of one type with a linear
search where at least one instance is dirty in a collection by type,
you must check every instance in the collection. With a tree search
you need not check any instance whose referent hasn't been marked as
"containing something dirty".
Post by Neil Williams
Post by Chris Shoemaker
I can see that QSF only needs to handle lists of uniformly typed
entities. However, if there's no way to ask "are there dirty
Transactions in this Account"
The Account is marked dirty but the entities responsible are not currently
identifiable.
Post by Chris Shoemaker
, then *every* selection of a subset of
Splits for committing will require a linear search through *all*
Splits. Does that seem like a problem to you?
It would if I could see a need to identify only these entities.
Right, well, obviously, if you don't mind committing 100000 splits
when only one has changed, then the cost of finding the one doesn't
really matter.
Post by Neil Williams
Currently, I can only see this as a solution in search of a problem.
Maybe you're right, but let me play devil's advocate: I don't know the
current state of the backends, but imagine this scenario: Backend is
remote server, and connection to server goes down. What happens? One
option is that GC prevents the user from continuing to edit the data
on the screen. Option two is that GC alerts the user that the
connection went down and that changes will be committed to the server
when the connection comes back, if ever. Let's say we want option
two. The user adds/changes some splits and the connection comes back
so we want to commit what has changed. But how? Several options:

1) We cached the changes as they were made (as you describe in your
"predictive" method.) We just clear the cache.

2) We just send the entire Split collection to the backend and let
it figure out what changed.

3) We do a linear search through the Split collection to find the
few changes and commit those.

4) We do a tree search that finds that only one Account is marked as
"contains dirty Splits" so our linear search through Splits is only
through that Account's Splits instead of all Splits. We find the
changes and commit them.

Any of those options would work. But if this is something that
happens often, 2) and 3) will probably be unacceptably expensive.

Maybe GC will never have to address this issue because it will never
support an "offline" mode with a remote backend. If it does, 4) will
be easy to implement as long as instances store a reference to their
"parent", like Split does. The implementation is simply to do the
same thing to the parent's "contains something dirty" flag as you
currently want to do to the Collections "dirty" flag.

-chris
Neil Williams
2005-07-21 22:22:39 UTC
Permalink
Post by Chris Shoemaker
If by "incremental storage system" you mean something that commits
only what has changed, then we're on the same page.
Yes.
Post by Chris Shoemaker
(Incidentally,
even "immediate-commit" systems sometimes fallback to "delayed-commit"
systems when they're in "offline" mode.)
Yes.
Post by Chris Shoemaker
Post by Neil Williams
I think it would be too large to inflict on all users at all times for
the odd occasion that it might be useful.
I think you may misunderstand. Both the linear search and the tree
search are retrospective, and the cost of the linear search for dirty
instances of all types will *always* be equal to or greater than the
tree search, and usually (in the cases where not everything is dirty)
it will be MUCH greater.
Proof: To find all the dirty instances of one type with a linear
search where at least one instance is dirty in a collection by type,
you must check every instance in the collection. With a tree search
you need not check any instance whose referent hasn't been marked as
"containing something dirty".
My problem here is that the tree search is difficult to do in QOF because
there is no tree that QOF can understand. This would be one of the logic
functions in the intermediate library that is also being discussed - a
function specific to GnuCash and CashUtil.
Post by Chris Shoemaker
Post by Neil Williams
Currently, I can only see this as a solution in search of a problem.
:-)
Post by Chris Shoemaker
I don't know the
current state of the backends, but imagine this scenario: Backend is
remote server, and connection to server goes down. What happens?
Currently? I think GnuCash should fall back to a file:// URL and save the
entire book to GnuCash XML v2. Actually, there is a note in the source about
this:
/* If there is a backend, and the backend is reachable
* (i.e. we can communicate with it), then synchronize with
* the backend. If we cannot contact the backend (e.g.
* because we've gone offline, the network has crashed, etc.)
* then give the user the option to save to the local disk.
*
* hack alert -- FIXME -- XXX the code below no longer
* does what the words above say. This needs fixing.
http://code.neil.williamsleesmill.me.uk/gnome2/qofsession_8c-source.html#l01226
(scroll down to line 1325)
I'll look at fixing that.
There is code in the backend handlers that falls back to file:// if the
preferred access method is not usable. That could easily be extended.
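A hedged sketch of what such an extension might look like, with entirely invented names (the real handler code in qofsession.c is different): try the preferred backend, and fall back to a file:// URL if it cannot be reached.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch of backend-selection fallback. None of these
 * names come from the real QofSession code. */
typedef int (*demo_connect_fn)(const char *url);

static int demo_connect_ok(const char *url)   { (void)url; return 1; }
static int demo_connect_fail(const char *url) { (void)url; return 0; }

/* Returns the URL actually used, writing a file:// fallback into
 * fallback_buf when the preferred backend is unreachable. */
static const char *demo_open_with_fallback(const char *preferred,
                                           demo_connect_fn try_connect,
                                           char *fallback_buf,
                                           size_t buflen)
{
    if (try_connect(preferred))
        return preferred;
    snprintf(fallback_buf, buflen, "file:///tmp/gnucash-recovery.xml");
    return fallback_buf;
}
```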
Post by Chris Shoemaker
One
option is that GC prevents the user from continuing to edit the data
on the screen. Option two is that GC alerts the user that the
connection went down and that changes will be committed to the server
when the connection comes back, if ever. Let's say we want option
two. The user adds/changes some splits and the connection comes back
so we want to commit what has changed. But how?
I think it's risky to offer option 2 without some kind of fallback - what if
the server is actually local and the problem is a sign of something more
serious - the user's system has become unstable etc.? Alternatively, the user
might just need to do something else and cannot keep GnuCash running until
the server comes back online.
That said, the SQL backend can use last_update to identify those instances
that have changed, both during the outage and afterwards, once the connection
is restored.
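As a rough illustration of that filtering step (the structure and function names here are hypothetical, not the actual qof_instance_get_last_update machinery): once the connection is back, only instances whose timestamp falls after the start of the outage need re-committing.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical instance record carrying a last_update timestamp. */
typedef struct {
    time_t last_update;
} DemoInstance;

/* After the connection is restored, only instances modified since
 * the outage began need to be re-committed to the server. */
static size_t demo_changed_since(const DemoInstance *inst, size_t n,
                                 time_t outage_start)
{
    size_t changed = 0;
    for (size_t i = 0; i < n; i++)
        if (inst[i].last_update >= outage_start)
            changed++;
    return changed;
}
```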
I'd envisage the user taking the option to save to a local file as the HIG /
intuitive action. Then, once the problem was fixed, the file (edited or not)
could be reloaded and use Save As... to re-establish the connection to the
remote server. Just as in any other situation where the backend receives a
whole new file, there will be increased network traffic until the two are
synchronised.
Saving to a local file will automatically reset all dirty flags anyway. We
cannot expect to preserve dirty flags if we give the user the (expected)
intuitive option to save to a local file in the event of a remote failure.
Post by Chris Shoemaker
1) We cached the changes as they were made (as you describe in your
"predictive" method.) We just clear the cache.
Yuk. I only gave that example to show how it wouldn't work!
:-)
Post by Chris Shoemaker
2) We just send the entire Split collection to the backend and let
it figure out what changed.
SQL can cope with that. All that happens is that on resuming the connection,
the network traffic increases until the SQL backend is back in sync.
After all, we are not the only application to use a remote connection to a SQL
server and this problem is not uncommon. As it is the server that deals with
the most events of this kind, I don't think it's unreasonable to expect the
server to have efficient code to handle the results of a connection restart,
independent of which application is using the server. In some situations,
it's even built into the protocol.
Post by Chris Shoemaker
3) We do a linear search through the Split collection to find the
few changes and commit those.
QOF isn't optimised to do that, SQL probably is.
Post by Chris Shoemaker
4) We do a tree search that finds that only one Account is marked as
"contains dirty Splits" so our linear search through Splits is only
through that Account's Splits instead of all Splits. We find the
changes and commit them.
To me, this is doing the work of the backend in the UI. Remember, the backend
- like the book - knows nothing about the tree. The only routines that know
anything about the conceptual hierarchy of Account over Split are the GUI
tree model functions.
Post by Chris Shoemaker
Any of those options would work. But if this is something that
happens often, 2) and 3) will probably be unacceptably expensive.
I'm still not convinced that this should be done in the UI. Any backend that
utilises a remote connection should be capable of handling outages in that
connection. That is the responsibility of the backend and it is a job best
left to the backend to sort out.
Post by Chris Shoemaker
Maybe GC will never have to address this issue because it will never
support an "offline" mode with a remote backend.
It should and I'll look at making the file:// fallback work.
Post by Chris Shoemaker
If it does, 4) will
be easy to implement as long as instances store a reference to their
"parent", like Split does. The implementation is simply to do the
same thing to the parent's "contains something dirty" flag as you
currently want to do to the Collections "dirty" flag.
The same problem keeps getting in the way. The book, the backend, the
collection and the entire query framework know nothing about the parental
relationship between Account and Split other than that it is an available
parameter of the relevant objects.
The tree is too specific - QOF is generic and does not get into the specific
conceptual relationships.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Chris Shoemaker
2005-07-22 01:33:02 UTC
Permalink
Post by Neil Williams
Post by Chris Shoemaker
If by "incremental storage system" you mean something that commits
only what has changed, then we're on the same page.
Yes.
Post by Chris Shoemaker
(Incidentally,
even "immediate-commit" systems sometimes fallback to "delayed-commit"
systems when they're in "offline" mode.)
Yes.
Post by Chris Shoemaker
Post by Neil Williams
I think it would be too large to inflict on all users at all times for
the odd occasion that it might be useful.
I think you may misunderstand. Both the linear search and the tree
search are retrospective, and the cost of the linear search for dirty
instances of all types will *always* be equal to or greater than the
tree search, and usually (in the cases where not everything is dirty)
it will be MUCH greater.
Proof: To find all the dirty instances of one type with a linear
search where at least one instance is dirty in a collection by type,
you must check every instance in the collection. With a tree search
you need not check any instance whose referent hasn't been marked as
"containing something dirty".
My problem here is that the tree search is difficult to do in QOF because
there is no tree that QOF can understand. This would be one of the logic
functions in the intermediate library that is also being discussed - a
function specific to GnuCash and CashUtil.
I haven't really been following that thread closely, but maybe QOF
isn't the right place for a tree search. I don't really know enough
to say.
Post by Neil Williams
Post by Chris Shoemaker
Post by Neil Williams
Currently, I can only see this as a solution in search of a problem.
:-)
Post by Chris Shoemaker
I don't know the
current state of the backends, but imagine this scenario: Backend is
remote server, and connection to server goes down. What happens?
Currently? I think GnuCash should fall back to a file:// URL and save the
entire book to GnuCash XML v2. Actually, there is a note in the source about
No, not currently. Ideally.
Post by Neil Williams
/* If there is a backend, and the backend is reachable
* (i.e. we can communicate with it), then synchronize with
* the backend. If we cannot contact the backend (e.g.
* because we've gone offline, the network has crashed, etc.)
* then give the user the option to save to the local disk.
*
* hack alert -- FIXME -- XXX the code below no longer
* does what the words above say. This needs fixing.
http://code.neil.williamsleesmill.me.uk/gnome2/qofsession_8c-source.html#l01226
(scroll down to line 1325)
I'll look at fixing that.
There is code in the backend handlers that falls back to file:// if the
preferred access method is not usable. That could easily be extended.
Post by Chris Shoemaker
One
option is that GC prevents the user from continuing to edit the data
on the screen. Option two is that GC alerts the user that the
connection went down and that changes will be committed to the server
when the connection comes back, if ever. Let's say we want option
two. The user adds/changes some splits and the connection comes back
so we want to commit what has changed. But how?
I think it's risky to offer option 2 without some kind of fallback - what if
the server is actually local and the problem is a sign of something more
serious - the user's system has become unstable etc.? Alternatively, the user
might just need to do something else and cannot keep GnuCash running until
the server comes back online.
That doesn't sound very good.
Post by Neil Williams
That said, the SQL backend can use last_update to identify those instances
that have changed, both during the outage and afterwards, once the connection
is restored.
I'd envisage the user taking the option to save to a local file as the HIG /
intuitive action. Then, once the problem was fixed, the file (edited or not)
could be reloaded and use Save As... to re-establish the connection to the
remote server. Just as in any other situation where the backend receives a
whole new file, there will be increased network traffic until the two are
synchronised.
Saving to a local file will automatically reset all dirty flags anyway. We
cannot expect to preserve dirty flags if we give the user the (expected)
intuitive option to save to a local file in the event of a remote failure.
You're describing the worst-case scenario. Going offline is not
necessarily an error condition. Many frontend/backend systems are
*designed* to do this as a regular part of operation. For those
systems, the solutions you describe are not acceptable. It may be
impossible to save everything to a local file, because the client may have
only the portion of the data that he was editing/viewing. And
retransmitting *everything* he does have every time he comes back
online isn't feasible either.
Post by Neil Williams
Post by Chris Shoemaker
1) We cached the changes as they were made (as you describe in your
"predictive" method.) We just clear the cache.
Yuk. I only gave that example to show how it wouldn't work!
:-)
Yeah, it's complicated, but some systems do this.
Post by Neil Williams
Post by Chris Shoemaker
2) We just send the entire Split collection to the backend and let
it figure out what changed.
SQL can cope with that. All that happens is that on resuming the connection,
the network traffic increases until the SQL backend is back in sync.
And how often would we resend the entire collection of splits? Every
time the SQL connection goes down and comes back up. Which could be
every 30 seconds. It's a good thing there are alternatives.
Post by Neil Williams
After all, we are not the only application to use a remote connection to a SQL
server and this problem is not uncommon. As it is the server that deals with
the most events of this kind, I don't think it's unreasonable to expect the
server to have efficient code to handle the results of a connection restart,
independent of which application is using the server. In some situations,
it's even built into the protocol.
Post by Chris Shoemaker
3) We do a linear search through the Split collection to find the
few changes and commit those.
QOF isn't optimised to do that, SQL probably is.
You can't optimize a linear search. You *have* to check every
instance.
Post by Neil Williams
Post by Chris Shoemaker
4) We do a tree search that finds that only one Account is marked as
"contains dirty Splits" so our linear search through Splits is only
through that Account's Splits instead of all Splits. We find the
changes and commit them.
To me, this is doing the work of the backend in the UI. Remember, the backend
You're saying the frontend doesn't need to know which instances are dirty?
Post by Neil Williams
- like the book - knows nothing about the tree. The only routines that know
anything about the conceptual hierarchy of Account over Split are the GUI
tree model functions.
Well, the functions in the engine know that an Account has a list of splits.
Post by Neil Williams
Post by Chris Shoemaker
Any of those options would work. But if this is something that
happens often, 2) and 3) will probably be unacceptably expensive.
I'm still not convinced that this should be done in the UI. Any backend that
utilises a remote connection should be capable of handling outages in that
connection. That is the responsibility of the backend and it is a job best
left to the backend to sort out.
I disagree. Both ends need to be intelligent. It is easy to put all
responsibility on the backend, but resending all the data to the
backend just because the frontend has no concept of what's dirty is
inefficient.
Post by Neil Williams
Post by Chris Shoemaker
Maybe GC will never have to address this issue because it will never
support an "offline" mode with a remote backend.
It should and I'll look at making the file:// fallback work.
Well, bailing out to a file might be a nice way to handle severe
errors, but it doesn't make gc support "offline" mode. Like I said,
maybe it never will.
Post by Neil Williams
Post by Chris Shoemaker
If it does, 4) will
be easy to implement as long as instances store a reference to their
"parent", like Split does. The implementation is simply to do the
same thing to the parent's "contains something dirty" flag as you
currently want to do to the Collections "dirty" flag.
The same problem keeps getting in the way. The book, the backend, the
collection and the entire query framework know nothing about the parental
relationship between Account and Split other than that it is an available
parameter of the relevant objects.
The tree is too specific - QOF is generic and does not get into the specific
conceptual relationships.
In your view, where exactly are those relationships best represented?
-chris
Neil Williams
2005-07-22 08:45:08 UTC
Permalink
Post by Chris Shoemaker
Post by Neil Williams
I think it's risky to offer option 2 without some kind of fallback - what
if the server is actually local and the problem is a sign of something
more serious - the user's system has become unstable etc.? Alternatively,
the user might just need to do something else and cannot keep GnuCash
running until the server comes back online.
That doesn't sound very good.
? The user has to make that decision. If the backend is designed to go online
and offline then the QofBackend can deal with that. This would only come into
play when the user expects a save/commit and an error is reported. As an
option in that error dialog, the user needs some way of saving the changes
they have made so far - in case it IS a major fault with the server. GnuCash
cannot always detect that, neither can QOF, only the user has that
information.
Post by Chris Shoemaker
You're describing the worst-case scenario. Going offline is not
necessarily an error condition.
Yes, I'm only concerned with the error condition - a complete failure in the
connection. Everything else is backend-specific because only certain backends
can ever support incremental backups, partial data or remote connections.
Post by Chris Shoemaker
Many frontend/backend systems are
*designed* to do this as a regular part of operation. For those
systems, the solutions you describe are not acceptable.
Those would not report an error, they are designed to operate that way. If the
backend repeatedly fails to communicate with its own server, it will have to
report an error eventually and this fallback would be available.
The backend mechanism is generic - it has to deal with each backend equally
and let the backend do what it can do best.
Post by Chris Shoemaker
It may be
impossible to save everything to a local file, because client may have
only the portion of the data that he was editing/viewing.
Then the book is a partial book and can be saved to QSF. The user cannot be
left dangling with no way to recover data entered after a remote connection
went down.
Post by Chris Shoemaker
And how often would we resend the entire collection of splits?
Only in case of a resumption after an error condition and only under direct
user control.
Post by Chris Shoemaker
Post by Neil Williams
Post by Chris Shoemaker
4) We do a tree search that finds that only one Account is marked as
"contains dirty Splits" so our linear search through Splits is only
through that Account's Splits instead of all Splits. We find the
changes and commit them.
To me, this is doing the work of the backend in the UI.
A backend that is customised to handle incremental backups and partial data
will be able to identify the modified instances using last_update without
EVER knowing anything about this "tree" that simply doesn't exist.
There is NO tree. (I wish I'd said that the first time you mentioned it - note
to self, do as Derek advised and NEVER allow any use of "tree" again!)
Post by Chris Shoemaker
Post by Neil Williams
Remember, the
backend
You're saying the frontend doesn't need to know which instances are dirty?
? No, I'm saying that the backend can do its own thing, including online and
offline cycles if it supports those.
The backend does not understand an Account. It has no concept of a Split. They
are all just objects. The backend cannot do a tree search, no tree exists. As
I showed in the previous message, the source code HAS no tree - it only
exists in the human mind, the UI and the docs.
There is NO direct relationship between an Account and its Transactions.
None. Zip. No engine call can go from an Account to a Transaction, the source
code is simply not built to support that, neither does it need to.
Accounts know about their Splits and Splits know about Transactions and
Accounts. Transactions know nothing of Accounts and Accounts know nothing
about Transactions.
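Those pointer directions can be sketched with bare structs. These are simplified stand-ins, not the real Account/Split/Transaction definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins showing only the direction of the pointers:
 * an Account holds its Splits; a Split points to both its Account
 * and its Transaction; a Transaction holds Splits but carries no
 * Account pointer, so Account <-> Transaction is only reachable by
 * going through a Split. */
struct DemoAccount;
struct DemoTransaction;

typedef struct DemoSplit {
    struct DemoAccount     *account;  /* Split knows its Account */
    struct DemoTransaction *trans;    /* Split knows its Transaction */
} DemoSplit;

typedef struct DemoAccount {
    DemoSplit *splits;                /* Account knows its Splits */
    size_t n_splits;
} DemoAccount;

typedef struct DemoTransaction {
    DemoSplit *splits;                /* Transaction knows its Splits */
    size_t n_splits;
} DemoTransaction;

/* The only route from an Account to a Transaction is via a Split. */
static DemoTransaction *demo_account_first_trans(DemoAccount *a)
{
    return (a->n_splits > 0) ? a->splits[0].trans : NULL;
}
```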
The engine knows NONE of this. All the engine cares about is that there are
three objects and N parameters. What the engine doesn't know, the backend
cannot understand and the book cannot find.
That is why QOF can be spun-out as a generic framework, it has had all the
financial objects, all the human concepts and all the specific relationships
stripped out. It treats every entity the same way, everything is equal, no
hierarchy exists of any kind.
In some ways, I regret using the term book - it implies a structure with a
cover, a table of contents, an index, chapters and page numbers that simply
does not exist. In reality, it's closer to a tombola: lots of coloured balls
with numbers stamped on them, each one identical to all the others in all
other respects. Balls of the same colour are attached to each other
(collections) which is where the analogy breaks down.
Post by Chris Shoemaker
Well, the functions in the engine know that an Account has a list of splits.
No, they do not. The engine code does not know anything called an Account or a
Split. All it knows is that one object contains a pointer to another object
using a parameter identified using a particular static string. Some
parameters are strings, or integers or booleans etc. Until v.recently, no
object contained a list of anything, as far as the engine was concerned -
there was no parameter that could express a list of other objects.
At no point does the engine have any concept of a Split or an Account or any
other human concept. It's all just objects and parameters. That's it.
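A miniature of that view, loosely in the spirit of QOF's string-keyed parameter registration but with invented names throughout: the engine stores only a static string and a getter, and never learns what the object means.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of parameter registration: the engine sees
 * only a string key and a getter; whether the object is an Account,
 * a Split, or a watermelon is invisible to it. */
typedef const void *(*demo_getter)(const void *obj);

typedef struct {
    const char *name;   /* e.g. "owner" -- just a string to the engine */
    demo_getter get;
} DemoParam;

static const void *demo_lookup(const void *obj,
                               const DemoParam *params, size_t n,
                               const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(params[i].name, name) == 0)
            return params[i].get(obj);
    return NULL;
}

/* One "object" with a single pointer-valued parameter. */
typedef struct { const void *owner; } DemoObj;

static const void *demo_obj_get_owner(const void *obj)
{
    return ((const DemoObj *)obj)->owner;
}
```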
Post by Chris Shoemaker
Post by Neil Williams
The tree is too specific - QOF is generic and does not get into the
specific conceptual relationships.
In your view, where exactly are those relationships best represented?
1. The UI display
2. The docs
3. The minds of the developer and the user.
There is no place for specific conceptual relationships in the engine, the
backend, the book or the session.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Chris Shoemaker
2005-07-22 14:07:38 UTC
Permalink
?!? Are we talking about the same project? Gnucash?
<major snip>
Let me try to focus on the heart of the disconnect. The "offline
backend" was just one fictional example of a case where the knowledge
of financial relationships makes the difference between a reasonable
and unreasonable implementation. The real issue here is your
insistence that gnucash's "engine" has no concept of financial objects
or their relationships. Maybe there's a terminology mismatch here.
By "engine" I meant the code under src/engine/.
The lack of a relationship between Transaction and Account is not
evidence that there is no "tree" in the source code. Transactions and
Accounts don't have any direct relationship because they're not
supposed to. What sort of direct relationship would you expect them
to have?
You keep saying that the financial relationships can't be in engine,
because QOF doesn't know a Split from a Watermelon, can't be in the
backend because "what the engine doesn't know the backend cannot
understand", can't be in QOF because all the financial concepts have
been stripped out. Instead, you say that the financial relationships
need to be in the OBJECTS. What exactly does that mean? (And don't
bother explaining OOP to me; I just finished teaching a course in
OOP. I get it, trust me.)
Post by Neil Williams
The backend does not understand an Account. It has no concept of a Split. They
are all just objects. The backend cannot do a tree search, no tree exists. As
I showed in the previous message, the source code HAS no tree - it only
exists in the human mind, the UI and the docs.
There is NO direct relationship between an Account and it's Transactions.
None. Zip. No engine call can go from an Account to a Transaction, the source
code is simply not built to support that, neither does it need to.
Accounts know about their Splits and Splits know about Transactions and
Accounts. Transactions know nothing of Accounts and Accounts know nothing
about Transactions.
The engine knows NONE of this. All the engine cares about is that there are
three objects and N parameters. What the engine doesn't know, the backend
cannot understand and the book cannot find.
That is why QOF can be spun-out as a generic framework, it has had all the
financial objects, all the human concepts and all the specific relationships
stripped out. It treats every entity the same way, everything is equal, no
hierarchy exists of any kind.
In some ways, I regret using the term book - it implies a structure with a
cover, a table of contents, an index, chapters and page numbers that simply
does not exist. In reality, it's closer to a tombola: lots of coloured balls
with numbers stamped on them, each one identical to all the others in all
other respects. Balls of the same colour are attached to each other
(collections) which is where the analogy breaks down.
Post by Chris Shoemaker
Well, the functions in the engine know that an Account has a list of splits.
No, they do not. The engine code does not know anything called an Account or a
Split. All it knows is that one object contains a pointer to another object
using a parameter identified using a particular static string. Some
parameters are strings, or integers or booleans etc. Until v.recently, no
object contained a list of anything, as far as the engine was concerned -
there was no parameter that could express a list of other objects.
At no point does the engine have any concept of a Split or an Account or any
other human concept. It's all just objects and parameters. That's it.
??? The "engine" you're talking about is not src/engine/.
Post by Neil Williams
Post by Chris Shoemaker
Post by Neil Williams
The tree is too specific - QOF is generic and does not get into the
specific conceptual relationships.
In your view, where exactly are those relationships best represented?
1. The UI display
2. The docs
3. The minds of the developer and the user.
I expected you to add 4. OBJECTS.
Post by Neil Williams
There is no place for specific conceptual relationships in the engine, the
backend, the book or the session.
Um. Those financial relationships are a pretty important part of a
financial app. And they need to exist in *code*, in the data
structures (which they currently do, BTW), and preferably not in GUI
code. If you want to call something an "engine" that has no concept
of financial relationships, fine. There still needs to be "a library
of core inter-related financial data-structures and the operations
that manipulate them, using those relationships." Currently, that's
src/engine/, but we can call it something else if there's a consensus.
-chris
Post by Neil Williams
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
_______________________________________________
gnucash-devel mailing list
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Neil Williams
2005-07-22 15:13:11 UTC
Permalink
Post by Chris Shoemaker
?!? Are we talking about the same project? Gnucash?
:-) Yes.
Post by Chris Shoemaker
Let me try to focus on the heart of the disconnect. The "offline
backend" was just one fictional example of a case where the knowledge
of financial relationships makes the difference between a reasonable
and unreasonable implementation. The real issue here is your
insistence that gnucash's "engine" has no concept of financial objects
or their relationships. Maybe there's a terminology mismatch here.
By "engine" I meant the code under src/engine/.
By engine, I mean QOF. There is a difference there and it relates to the
objects. In src/engine, we have various objects that are central to GnuCash,
Account, Trans, Split, Lots. Those are not part of QOF. Those, together with
the business objects gncInvoice etc. in src/business/business-core/, comprise
what I think of as the "Object Layer". A layer of code that is between the UI
and QOF. It represents the majority of the financial logic, it represents
everything that these objects can and cannot do but it is not the sum of the
financial logic in GnuCash. It is distinct from QOF and this will become
v.clear when QOF is used as an external library.
I'm sorry if this wasn't clear. It comes from working with QOF as an external
library with non-financial objects, the dividing line becomes clearer than if
you look at it from within GnuCash.
In src/engine, some of the gnc-*.c|h and all qof*.c|h are QOF. The files with
capitals are not. Unfortunately, there are other files in src/engine (like
the budget) that don't fit this pattern and there are files like kvp* and
md5* that ARE part of QOF. There are also historical headers that redefine
QOF functions as gnc functions which confuse things more.
Here's the contents of the QOF equivalent directory:
C files:
gnc-date.c gnc-engine-util.c gnc-event.c gnc-numeric.c gnc-trace.c
guid.c kvp_frame.c kvp-util.c md5.c qofbackend.c
qofbook.c qof_book_merge.c qofchoice.c qofclass.c qofgobj.c
qofid.c qofinstance.c qofmath128.c qofobject.c qofquery.c
qofquery-deserial.c qofquery-serialize.c qofquerycore.c qofsession.c qofsql.c

H files:
gnc-date.h gnc-engine-util.h gnc-event.h gnc-event-p.h gnc-numeric.h
gnc-trace.h guid.h kvp_frame.h kvp-util.h kvp-util-p.h
md5.h qof.h qofbackend.h qofbackend-p.h qof-be-utils.h
qofbook.h qofbook-p.h qof_book_merge.h qofchoice.h qofclass.h
qofclass-p.h qofgobj.h qofid.h qofid-p.h qofinstance.h
qofinstance-p.h qofmath128.h qofobject.h qofobject-p.h qofquery.h
qofquery-p.h qofquery-deserial.h qofquery-serialize.h qofquerycore.h qofquerycore-p.h
qofsession.h qofsession-p.h qofsql.h
Using QOF as an external library, all these files would be removed from
GnuCash src/engine with no loss of function. It's not ready for that yet, and
it is those areas of the GnuCash code where I'll be working.
None of the *-p.h private headers will be available to programs linked against
QOF and that is where I will be enhancing the API to allow changing the
current GnuCash code to use the API instead of private headers.
There are issues here, notably about who controls gnc-trace.c and I'll be
looking at ways to pass the identification of the app to gnc-trace so that
GnuCash can still produce GnuCash trace logs instead of qof.trace. That
Post by Chris Shoemaker
The lack of a relationship between Transaction and Account is not
evidence that there is no "tree" in the source code.
(There is no tree, trust me!)
Post by Chris Shoemaker
You keep saying that the financial relationships can't be in engine,
Because those are defined in files like Account.c that are in the object
layer.
Post by Chris Shoemaker
because QOF doesn't know a Split from a Watermelon,
:-) Honest, it doesn't. QOF can quite easily cope with a book of biological
classifications of watermelons! It could cope with a whole range of data -
the only things I'm not sure about supporting so far are v.large amounts of
unbroken text and binary data.
Post by Chris Shoemaker
??? The "engine" you're talking about is not src/engine/.
No, it's not, the engine is QOF. That's why the object layer needs to be part
of this intermediate library used by CashUtil so that CashUtil is talking the
same language (i.e. using the same objects) as GnuCash, unlike my other QOF
applications that could be dealing with pilot-link datasets or GnoTime or
watermelons!
Post by Chris Shoemaker
Post by Neil Williams
Post by Chris Shoemaker
In your view, where exactly are those relationships best represented?
1. The UI display
2. The docs
3. The minds of the developer and the user.
I expected you to add 4. OBJECTS.
There are relationships in the objects but they don't form a tree.
:-)
Post by Chris Shoemaker
Um. Those financial relationships are a pretty important part of a
financial app.
Hence the intermediate library. They are of no use to pilot-link or GnoTime -
QOF really doesn't care so they need to be elsewhere. The sensible place is
where GnuCash and CashUtil can use the same source whilst allowing CashUtil
to be installed without GnuCash.
Post by Chris Shoemaker
And they need to exist in *code*, in the data
structures (which they currently do, BTW),
True.
Post by Chris Shoemaker
and preferably not in GUI
code. If you want to call something an "engine" that has no concept
of financial relationships, fine. There still needs to be "a library
of core inter-related financial data-structures and the operations
that manipulate them, using those relationships."
That's what I want to have as an intermediate library.
Post by Chris Shoemaker
Currently, that's
src/engine/, but we can call it something else if there's a consensus.
And elsewhere too, otherwise building the library would be easy. GnuCash, now,
wouldn't be the app it is without the business objects, so these cannot be left
out of the CLI. It's the other logic, currently in areas of the UI, that will
take a little time to identify and sort.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Derek Atkins
2005-07-22 15:23:04 UTC
Permalink
Post by Neil Williams
In src/engine, some of the gnc-*.c|h and all qof*.c|h are QOF. The files with
capitals are not. Unfortunately, there are other files in src/engine (like
the budget) that don't fit this pattern and there are files like kvp* and
md5* that ARE part of QOF. There are also historical headers that redefine
QOF functions as gnc functions which confuse things more.
I wonder if in g2 we should just move the qof files out of src/engine and into
lib/qof? Too bad we're still using CVS; with SVN it would be easy to do that.
;)
-derek
--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
***@MIT.EDU PGP key available
Neil Williams
2005-07-22 15:45:14 UTC
Permalink
Post by Derek Atkins
Post by Neil Williams
In src/engine, some of the gnc-*.c|h and all qof*.c|h are QOF. The files with
capitals are not. Unfortunately, there are other files in src/engine
(like the budget) that don't fit this pattern and there are files like
kvp* and md5* that ARE part of QOF. There are also historical headers
that redefine QOF functions as gnc functions which confuse things more.
I wonder if in g2 we should just move the qof files out of src/engine and
into lib/qof? Too bad we're still using CVS; with SVN it would be easy to
do that. ;)
That would be a great idea. I don't mind doing all the cvs remove / cvs add;
it's not that much of a bind, it's the history of the files as they are now
that would presumably be lost? We'd lose that anyway when moving to an
external library and the changes are fairly consistent with the QOF tree so
there is a history, just not in the current tree.

One big advantage from my point of view is that lib/qof could have the same
Makefile as qof/ itself, making it easier to patch from the QOF tree.

Let me know and I'll fold this change into a commit - just as soon as I'm
ready. (And yes, I'll test v.v.v.v.v.v.v carefully and I can send you the
patch if you'd like to test it first!)
:-)
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Derek Atkins
2005-07-22 15:53:13 UTC
Permalink
Post by Neil Williams
Post by Derek Atkins
I wonder if in g2 we should just move the qof files out of src/engine and
into lib/qof? Too bad we're still using CVS; with SVN it would be easy to
do that. ;)
That would be a great idea. I don't mind doing all the cvs remove cvs add,
it's not that much of a bind, it's the history of the files as they are now
that would presumably be lost? We'd lose that anyway when moving to an
external library and the changes are fairly consistent with the QOF tree so
there is a history, just not in the current tree.
One big advantage from my point of view is that lib/qof could have the same
Makefile as qof/ itself, making it easier to patch from the QOF tree.
Let me know and I'll fold this change into a commit - just as soon as I'm
ready. (And yes, I'll test v.v.v.v.v.v.v carefully and I can send you the
patch if you'd like to test it first!)
:-)
Nah, I don't think we're ready for it, now.. And honestly I don't want to lose
the history if I can avoid it..

I'm sort of thinking that we should wait until after the g2->head merge and then
we can clean up the CVS tree.

-derek
--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
***@MIT.EDU PGP key available
Neil Williams
2005-07-22 19:40:38 UTC
Permalink
Post by Derek Atkins
:-)
Nah, I don't think we're ready for it, now.. And honestly I don't want to
lose the history if I can avoid it..
I can understand that.
Post by Derek Atkins
I'm sort of thinking that we should wait until after the g2->head merge and
then we can clean up the CVS tree.
Fine. As the CLI improves, I'll do some more test builds with GnuCash against
an external QOF and against lib/qof/ to smooth the transition and identify
problems locally.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Chris Shoemaker
2005-07-21 22:09:10 UTC
Permalink
Post by Neil Williams
Post by Chris Shoemaker
Ok, this was my point. I completely understand that you can get a
very quick boolean answer to the question "has anything in the book
changed?" by checking each collection's dirty flag. But think about
*how* you'd have to create a list of all dirty entities for the case
where the task is to commit just the 1 split that changed in my
example.
I had been - and it could be solved but I'd have to formalise the idea first.
I'm not sure what is the real-world use for such an API. I can see it as a
fallback for a failed write but that isn't particularly common. I can see it
for incremental storage systems but we don't use those yet. (SQL aside as
that can do this via a separate mechanism).
Post by Chris Shoemaker
ISTM (It seems to me) there are 3 options: 1) You can't do
that; you must commit all 100000 Splits. 2) You can do that just
fine, but you must do a linear search through 100000 Splits to find
the 1 that changed. or 3) You start at the dirty book, and perform
the tree search I described before.
Derek's point stands: The book knows nothing about the tree. There is no tree
within the book, it only exists in *our* conceptualisation of the
relationships between objects. All the book knows about are collections and
collections are not linked to each other - only objects link to other
objects.
Now it *could* be possible for each collection to keep a GList of its own
changed entities. The question is, is it worth doing?
Keep in mind that all existing mechanisms are retrospective - not much is done
until the question is asked. Storing a GList of modified entities would have
to be predictive: whether you need it or not, it would be maintained on every
edit. This isn't just storing a single boolean value that covers tens of
thousands of entities; the GList would store each modified entity and could
get incredibly long in some cases. It may only be storing a pointer to the
entity or maybe its GUID (as the type is already determined by the
collection), but that will mount up. It is conceivable, with the SQL query
dialog that I've got planned for after G2, that the user could update every
single instance of one object type in one operation.
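The structure being weighed here could be sketched roughly as follows. This is
a hypothetical illustration in plain C with simplified stand-in types
(DirtyNode, Collection), not the real QOF structures: the boolean answer never
changes after the first edit, but the list grows with every distinct edit.

```c
#include <stdlib.h>

/* Hypothetical sketch, not the real QOF types: a collection that,
 * besides its single dirty boolean, keeps a list of every dirty
 * entity.  This is the "predictive" structure under discussion: it
 * is maintained on every edit whether or not anyone ever asks. */
typedef struct DirtyNode {
    void *entity;              /* or a GUID; the type is implied by the collection */
    struct DirtyNode *next;
} DirtyNode;

typedef struct {
    int dirty;                 /* the existing single boolean */
    DirtyNode *dirty_list;     /* the proposed per-entity list */
    size_t dirty_count;
} Collection;

/* Would be called on every edit: O(1) per edit, but the list grows
 * with the number of distinct dirtied entities. */
static void collection_mark_dirty(Collection *col, void *entity)
{
    DirtyNode *n = malloc(sizeof *n);
    n->entity = entity;
    n->next = col->dirty_list;
    col->dirty_list = n;
    col->dirty = 1;            /* the boolean saturates after the first edit */
    col->dirty_count++;
}

/* Demo: after three edits the boolean answer is unchanged but the
 * list holds three nodes - the storage cost being objected to. */
size_t demo_dirty_list(void)
{
    Collection col = { 0, NULL, 0 };
    int a, b, c;
    collection_mark_dirty(&col, &a);
    collection_mark_dirty(&col, &b);
    collection_mark_dirty(&col, &c);
    while (col.dirty_list) {   /* free the list */
        DirtyNode *n = col.dirty_list;
        col.dirty_list = n->next;
        free(n);
    }
    return col.dirty ? col.dirty_count : 0;
}
```

The sketch shows why the list only pays off if some caller actually needs the
identities of the changed entities, not just the yes/no answer.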
Post by Chris Shoemaker
The time cost difference between
2) and 3) can be arbitrarily large.
I think it would be too large to inflict on all users at all times for the odd
occasion that it might be useful.
Post by Chris Shoemaker
I can see that QSF only needs to handle lists of uniformly typed
entities. However, if there's no way to ask "are there dirty
Transactions in this Account"
The Account is marked dirty but the entities responsible are not currently
identifiable.
Post by Chris Shoemaker
, then *every* selection of a subset of
Splits for commiting will require a linear search through *all*
Splits. Does that seem like a problem to you?
It would if I could see a need to identify only these entities.
Currently, I can only see this as a solution in search of a problem.
Oh. I just thought of another "problem" solved by this solution:

David wants to put a '*' in the window title when the book is dirty.
No problem, query all the collections' dirty flag. Now, say we wanted
to extend the HIG usage to the sub windows.

We want an account window's title to have '*' if the account is dirty.
Of course, what we actually mean is "does the account contain any
dirty splits?" You may think this can be handled completely by CM
events, but perhaps not. Consider this scenario:

User opens an Account and adds a Trans with two splits.
Theoretically, we could catch a CM event here and mark that account
with a '*'. But then the user jumps to the other Account in the
Transaction. This window wasn't around when the dirtying was done, so
it couldn't catch any events. But, it is dirty, so we'd want to mark
it with a '*'.

How do we discover that it contains a dirty split? Either we have to
search through *every* split in the Collection, seeing if it's in our
account and is dirty, OR, we just check our Account's "contains
something dirty" flag. Big difference.
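The two options contrasted here could be sketched like this, using hypothetical
stand-in types rather than the real GnuCash Account and Split structures:
propagating a cached flag costs one extra store per edit, while the alternative
pays a full walk over all the splits on every query.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration (not the real types). */
typedef struct Account {
    int contains_dirty;        /* the proposed cached "contains dirty" flag */
} Account;

typedef struct Split {
    Account *account;
    int dirty;
} Split;

/* Option A: propagate at edit time - one extra store per edit. */
static void split_set_dirty(Split *s)
{
    s->dirty = 1;
    s->account->contains_dirty = 1;
}

/* Option B: no propagation - every query walks every split. */
static int account_scan_dirty(const Account *acct,
                              const Split *splits, size_t n_splits)
{
    for (size_t i = 0; i < n_splits; i++)
        if (splits[i].account == acct && splits[i].dirty)
            return 1;
    return 0;
}

/* Demo: both approaches agree; the difference is who pays, and when. */
int demo_account_dirty(void)
{
    Account a = { 0 }, b = { 0 };
    Split splits[3] = { { &a, 0 }, { &b, 0 }, { &b, 0 } };
    split_set_dirty(&splits[2]);               /* edit one split in b */
    /* cached flag and full scan give the same answer for both accounts */
    return b.contains_dirty == account_scan_dirty(&b, splits, 3)
        && a.contains_dirty == account_scan_dirty(&a, splits, 3);
}
```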

-chris
Post by Neil Williams
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
_______________________________________________
gnucash-devel mailing list
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
David Hampton
2005-07-21 22:55:16 UTC
Permalink
Post by Chris Shoemaker
David wants to put a '*' in the window title when the book is dirty.
No problem, query all the collections' dirty flag. Now, say we wanted
to extend the HIG usage to the sub windows.
There are no sub-windows in g2. There are only windows that can contain
a variety of page types (account hierarchies, registers, reports, etc),
and single purpose dialogs.
Post by Chris Shoemaker
How do we discover that it contains a dirty split? Either we have to
search through *every* split in the Collection, seeing if it's in our
account and is dirty, OR, we just check our Account's "contains
something dirty" flag. Big difference.
Splits and Accounts aren't the only things that get dirty. Stock quotes
and commodities are two others. I want a simple test on the *book* that
I can use to see if anything in the book (split, stock price, etc) is
dirty. Preferably a trivial test, as it will be called frequently.

Neil mentioned he will be modifying the code to use the new function
qof_instance_set_dirty() to set dirty flags on collections. If this
function also sets the dirty flag on the book containing the collection
then I have almost everything I need. In a perfect world, after
setting the book dirty flag the code would also call a hook list of
functions.

David
Neil Williams
2005-07-21 23:14:32 UTC
Permalink
Post by David Hampton
Neil mentioned he will be modifying the code to use the new function
qof_instance_set_dirty() to set dirty flags on collections. If this
function also sets the dirty flag on the book containing the collection
then I have almost everything I need.
It will. Every call to qof_book_not_saved automatically calls
qof_object_is_dirty, which checks the single boolean value in each collection
in the book and returns TRUE if ANY collection is dirty.
The collection will (with the new code) be marked as dirty as soon as any
entity in that collection is marked as dirty.

http://code.neil.williamsleesmill.me.uk/gnome2/group__Book.html#ga15

It's fast because it's only checking a single boolean flag in each case and
there are only as many primary collections as there are registered object
definitions.
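The retrospective check described here might look roughly like this. The names
(book_not_saved, Collection) and types are simplified stand-ins for
illustration, not the actual QOF implementation: the cost is one boolean read
per registered primary collection, independent of how many entities the book
holds.

```c
/* Hypothetical sketch: the book holds one primary collection per
 * registered object type (a dozen or so), and "is the book dirty?"
 * just reads one boolean per collection.  Simplified stand-in
 * types, not the real QOF API. */
#define N_PRIMARY_COLLECTIONS 12

typedef struct {
    int dirty;                 /* set once, when any entity in it changes */
} Collection;

typedef struct {
    Collection collections[N_PRIMARY_COLLECTIONS];
} Book;

/* Analogue of the qof_book_not_saved() behaviour described above:
 * O(number of collections), not O(number of entities). */
int book_not_saved(const Book *book)
{
    for (int i = 0; i < N_PRIMARY_COLLECTIONS; i++)
        if (book->collections[i].dirty)
            return 1;
    return 0;
}

int demo_book_check(void)
{
    Book book = { 0 };
    int clean = book_not_saved(&book);       /* nothing dirty yet */
    book.collections[5].dirty = 1;           /* one collection changes */
    return !clean && book_not_saved(&book);  /* now reports dirty */
}
```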
Post by David Hampton
In a perfect world, after
setting the book dirty flag the code would also call a hook list of
functions.
That can be arranged - Derek mentioned enhancing the event handler and the new
function provides a home for the call you need.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
David Hampton
2005-07-22 00:01:19 UTC
Permalink
Post by Neil Williams
Post by David Hampton
Neil mentioned he will be modifying the code to use the new function
qof_instance_set_dirty() to set dirty flags on collections. If this
function also sets the dirty flag on the book containing the collection
then I have almost everything I need.
It will. Every call to qof_book_not_saved automatically calls
qof_object_is_dirty which checks each collection in the book for the single
boolean value in each collection and returns TRUE if ANY collection is dirty.
You clearly haven't understood my statements. I don't want to have to
run the list of collections each time I ask if the book is dirty.
That's too much work. I want to short circuit that entire process by
checking *one* *flag* in the book. Not the collections. The book. I
would like the code that marks the collection as dirty to also mark the
book as dirty. None of the code I'm working on cares what has changed,
only that something has.
Post by Neil Williams
The collection will (with the new code) be marked as dirty as soon as any
entity in that collection is marked as dirty.
http://code.neil.williamsleesmill.me.uk/gnome2/group__Book.html#ga15
It's fast because it's only checking a single boolean flag in each case and
there are only as many primary collections as there are registered object
definitions.
Yes, thanks. I've seen that code. (Insert wry face here.) Nowhere
have you stated that setting the dirty flag on a collection will also
set book->inst.dirty. In fact, you've gone through some serious
contortions to avoid saying it.

O.K. Let's try another approach. The following is what I want. Tell me
why you can't implement it.

gboolean
qof_book_is_dirty (QofBook *book)
{
return book->inst.dirty;
}

guint
qof_book_get_dirty_time (QofBook *book)
{
return book->dirty_time;
}

void
qof_book_set_dirty (QofBook *book)
{
book->inst.dirty = TRUE;
if (book->dirty_time == 0)
book->dirty_time = time(NULL);
}

void
qof_book_set_clean (QofBook *book)
{
book->inst.dirty = FALSE;
book->dirty_time = 0;
}

qof_instance_set_dirty (...)
{
...whatever...
qof_book_set_dirty(book);
}

It totally baffles me why this has taken more than three email messages.
Do you not have a back pointer from the collection to the book? That
seems like a *HUGE* oversight if that's the problem.

David
Neil Williams
2005-07-22 08:10:13 UTC
Permalink
Post by David Hampton
You clearly haven't understood my statements. I don't want to have to
run the list of collections each time I ask if the book is dirty.
That's too much work.
It's not. At no time do you iterate through the contents of the collections to
do this. The code does iterate over the tiny number of primary collections, but
that takes no time at all. It takes longer to call the function itself than it
does to check 12 or so boolean flags, especially once that function is in
a shared library.
Post by David Hampton
I want to short circuit that entire process by
checking *one* *flag* in the book. Not the collections. The book. I
would like the code that marks the collection as dirty to also mark the
book as dirty. None of the code I'm working on cares what has changed,
only that something has.
The only problem with that is that the dirty flag in the book is not exposed
directly (or won't be once I've cleaned up the private header file issue).
You need to call a function via the API and that function will check the
collections. It really is no overhead.
Post by David Hampton
Yes, thanks. I've seen that code. (Insert wry face here.) Nowhere
have you stated that setting the dirty flag on a collection will also
set book->inst.dirty. In fact, you've gone through some serious
contortions to avoid saying it.
Because, as yet, it doesn't but neither does that matter.
:-)

The book reports as dirty if the collections are dirty. As there are so few
collections, it makes more sense to check each collection when asked rather
than continuously set the book dirty flag every time another collection has
changed.

It's retrospective, not prospective. I see no need to check or set the book
dirty flag every single time a single entity is changed, again and again and
again throughout the entire session. Propagating the flag to the collection
is sufficient.

The API is: If the book detects that a collection is dirty, the book reports
itself as dirty.

Reading the dirty flag directly will not be possible (as this would expose the
entire private QofBook struct in the API).

It shouldn't have been exposed and UI source files should not be including
qofbook-p.h, it will not be available when using QOF as a library.
Post by David Hampton
O.K. Let's try another approach. The following is what I want. Tell me
why you can't implement it.
gboolean
qof_book_is_dirty (QofBook *book)
{
return book->inst.dirty;
}
That's currently called qof_book_not_saved().
Post by David Hampton
guint
qof_book_get_dirty_time (QofBook *book)
{
return book->dirty_time;
}
I'll consider that.
Post by David Hampton
void
qof_book_set_dirty (QofBook *book)
{
book->inst.dirty = TRUE;
if (book->dirty_time == 0)
book->dirty_time = time();
}
But it's pointless even checking this EVERY single time an entity changes.
Nobody wants that information, all we want is to detect the FIRST change.

This function would get called every time qof_instance_set_dirty() is called
and that's just overkill.

Instead of calling the collections ONCE each, you want the instance to call
the book EVERY single time it changes! Instead of 12 or so calls, you are
recommending tens of thousands when only ONE is necessary.

That's like the foot soldier telling the major-general every time he starts to
walk!
:-)

Instances get dirty, it's their job, it's what they do. Books don't need to
know every single time. Once is perfectly adequate.

There could be tens of thousands of calls that set an instance dirty during a
user session - many instances are set dirty repeatedly already. The book
simply doesn't need to be told, it can ask when it needs the information
(which is far less often than it may appear).

Better, IMHO, to check the collections and cache the value - after all it is
the UI that controls WHEN the data is saved and how. Until the UI issues a
save command, the first dirty flag is the same as the last, there's no point
calling set() every single time.

There's also no point in asking the book if it is dirty once you've got the
answer YES. It will stay dirty until the UI issues the qof_session_save().
The book itself has no control over the save operations - it cannot clear
its own flag and it cannot force a save. This function shouldn't need to be
called every few seconds or at every window paint. Cache it and wait for a
Save.

There's no point asking the book if it is dirty again and again and again.
It's told you once, that's all it can do.

Therefore, qof_book_not_saved needs to be called far fewer times than you seem
to imagine.
Post by David Hampton
void
qof_book_set_clean (QofBook *book)
{
book->inst.dirty = FALSE;
book->dirty_time = 0;
}
At no time can the set_clean functions be public, these are reserved for the
backends. There is no situation where the UI can be allowed to set the book
to clean without going through a Save.
Post by David Hampton
It totally baffles me why this has taken more than three email messages.
Do you not have a back pointer from the collection to the book? That
seems like a *HUGE* oversight if that's the problem.
Yes, each collection has a pointer to its book. I just don't see the need to
pass that flag back every single time an entity is ever changed. When you
need to know that information (which is after the first change ONLY), the
collections can be checked. I believe it is more efficient for the UI to
cache this value until such time as the UI issues a Save command when it is
rechecked.

We should be providing information when it is requested, not setting it
repeatedly when nobody cares anymore.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Derek Atkins
2005-07-22 14:18:27 UTC
Permalink
Post by Neil Williams
At no time can the set_clean functions be public, these are reserved for the
backends. There is no situation where the UI can be allowed to set the book
to clean without going through a Save.
Actually that's not true. An "exit without save" might need to do it,
too.

David: I should point out for clarity that the current implementation
of "is the book dirty" does not do a full object search; it's not
searching _through_ the collections, it's just iterating over the set
of Collections and looking at the metadata on each Collection (not
each object in each collection). So, this is still pretty fast, as
Neil said. I don't think we need to cache that information at this
point; checking the metadata on the dozen Collections doesn't take
much time.

-derek
--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
***@MIT.EDU PGP key available
David Hampton
2005-07-22 16:14:20 UTC
Permalink
Post by Neil Williams
Post by David Hampton
You clearly haven't understood my statements. I don't want to have to
run the list of collections each time I ask if the book is dirty.
That's too much work.
It's not. At no time do you iterate through the contents of the collections to
do this. The code does iterate over the tiny number of primary collections, but
that takes no time at all. It takes longer to call the function itself than it
does to check 12 or so boolean flags, especially once that function is in
a shared library.
Really? My way. Load the function argument (1) and function address
(2), call the is_dirty subroutine (1), load the structure offset (1),
read the value (1), return (1). Depending on whether or not a stack
frame is needed, add another 3 instructions. 7 to 10 total instructions.

Your way. Load the function argument (1) and function address (2), call
the is_dirty subroutine (1), load the structure offset (1), read the
value (1), test for non-zero (1), load the address of the
object_is_dirty function (2) and call it (1). The count is already at
ten instructions and you haven't even started to check the objects. For
each object you need to load a pointer to the object (1), load an offset
(1), read the value (1) and test it (1). That's at least four
instructions per object, times your twelve objects, for 48 instructions.
Another two to unwind the stack and that's 50 more, for a total count of
at least 60 instructions. Depending on whether or not stack frames
are needed, add another 6 to 10 instructions. I come up with a number
of around 70 instructions total for your method. It's also O(n) worst
case instead of O(1). I will concede that the value of n is not
excessively large.
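The comparison could be condensed into a sketch like this (hypothetical
stand-in types and names, not the real QOF code): the cached flag makes the
query a single load, the scan is O(n) in the number of collections, and the
zero value of dirty_time doubles as the first-change test. Both styles return
the same answer.

```c
/* Hypothetical sketch of the two query styles being compared. */
#define N_COLLECTIONS 12

typedef struct {
    int collection_dirty[N_COLLECTIONS];
    int book_dirty;            /* the proposed cached flag on the book */
    long dirty_time;           /* time of first change, 0 while clean */
} Book;

/* Cached style: the edit path pays a constant cost... */
static void book_set_dirty(Book *b, int which, long now)
{
    b->collection_dirty[which] = 1;
    b->book_dirty = 1;
    if (b->dirty_time == 0)    /* zero doubles as the "first change" test */
        b->dirty_time = now;
}

/* ...and the query is a single load: O(1). */
static int book_is_dirty_cached(const Book *b) { return b->book_dirty; }

/* Scanning style: each query walks the collection flags: O(n). */
static int book_is_dirty_scan(const Book *b)
{
    for (int i = 0; i < N_COLLECTIONS; i++)
        if (b->collection_dirty[i])
            return 1;
    return 0;
}

int demo_compare(void)
{
    Book b = { { 0 }, 0, 0 };
    book_set_dirty(&b, 3, 1000);
    book_set_dirty(&b, 3, 2000);   /* second edit: dirty_time unchanged */
    return book_is_dirty_cached(&b) == book_is_dirty_scan(&b)
        && b.dirty_time == 1000;
}
```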
Post by Neil Williams
The only problem with that is that the dirty flag in the book is not exposed
directly (or won't be once I've cleaned up the private header file issue).
And I've never asked for it to be.
Post by Neil Williams
You need to call a function via the API and that function will check the
collections. It really is no overhead.
It's anywhere from the equivalent of what I'm proposing, to seven times
more work.
Post by Neil Williams
Post by David Hampton
void
qof_book_set_dirty (QofBook *book)
{
book->inst.dirty = TRUE;
if (book->dirty_time == 0)
book->dirty_time = time(NULL);
}
But it's pointless even checking this EVERY single time an entity changes.
Nobody wants that information, all we want is to detect the FIRST change.
This *is* detecting the first change. Would you rather it test for
book->inst.dirty == FALSE? There has to be a test somewhere to
determine first vs. subsequent. In the proposed code it is a test of
zero/non-zero. It just so happens that the non-zero case is also used
as a timestamp.
Post by Neil Williams
Instead of calling the collections ONCE each, you want the instance to call
the book EVERY single time it changes! Instead of 12 or so calls, you are
recommending tens of thousands when only ONE is necessary.
Nice large number. Got data to back up that figure?
Post by Neil Williams
Instances get dirty, it's their job, it's what they do. Books don't need to
know every single time. Once is perfectly adequate.
How do you know if this is the time that the book needs to be set dirty
if you don't test its current value? If you are testing the value,
you've just moved my above test for first/subsequent outside of the
qof_book_set_dirty() function. You're still doing the test though.
Post by Neil Williams
Better, IMHO, to check the collections and cache the value - after all it is
the UI that controls WHEN the data is saved and how.
Today, yes. Tomorrow, who knows?
Post by Neil Williams
There's also no point in asking the book if it is dirty once you've got the
answer YES. It will stay dirty until the UI issues the qof_session_save().
The book itself has no control over the save operations - it cannot clear
it's own flag and it cannot force a save. This function shouldn't need to be
called every few seconds or at every window paint. Cache it and wait for a
Save.
That presupposes that the frontend knows how all possible backends work.
I do not want to write code that has to play state coherency games to
make sure that it is correctly tracking the state of the backend. The
backend is the authority on whether or not the data is dirty, so the
frontend should ask when it wants to know. Current backends may not
save automatically, but that doesn't mean future backends will not.
Users have long asked for an autosave feature. That could be
implemented in either the frontend or the backend. The backend makes
more sense to me as only certain backends need autosave.
Post by Neil Williams
At no time can the set_clean functions be public, these are reserved for the
backends. There is no situation where the UI can be allowed to set the book
to clean without going through a Save.
Absolutely 100% wrong. It's called "Exit Without Saving." I need a
method to tell the backend to drop everything on the floor. Either that
or I need a global variable to pass data from point A in file 1, back
through the gtk idle loop to point B in file 2 to tell point B not to
bother checking to see if the data is dirty and should be saved before
quitting the program.

David
Derek Atkins
2005-07-22 18:09:04 UTC
Permalink
Post by David Hampton
Absolutely 100% wrong. It's called "Exit Without Saving." I need a
method to tell the backend to drop everything on the floor. Either that
or I need a global variable to pass data from point A in file 1, back
through the gtk idle loop to point B in file 2 to tell point B not to
bother checking to see if the data is dirty and should be saved before
quitting the program.
Shouldn't "qof_session_destroy()" do that for you? Do you need to
mark it clean because of some architectural issue or just because
that's the way it's implemented at the moment?
Post by David Hampton
David
-derek
--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
***@MIT.EDU PGP key available
David Hampton
2005-07-22 18:30:48 UTC
Permalink
Post by Derek Atkins
Post by David Hampton
Absolutely 100% wrong. It's called "Exit Without Saving." I need a
method to tell the backend to drop everything on the floor. Either that
or I need a global variable to pass data from point A in file 1, back
through the gtk idle loop to point B in file 2 to tell point B not to
bother checking to see if the data is dirty and should be saved before
quitting the program.
Shouldn't "qof_session_destroy()" do that for you? Do you need to
mark it clean because of some architectural issue or just because
that's the way it's implemented at the moment?
I'm catching the delete_event triggered by clicking on the window
manager close button. There will only be one content window open, but
it could have multiple views into the data (say the account tree, two
registers, and an invoice) and there could be any number of dialogs open
showing other aspects of the data in the book. I didn't want to call
qof_session_destroy() at this point because any one of these could have
references into the book data, and there's a lot of unwinding and
cleaning up to do. Frankly I don't know how well much of the code would
handle having the book yanked out from under it. Sure, it should handle
that, but how much has been tested? It's a can of worms I didn't want to
open. The safest path in terms of code stability seemed to me to mark
the book clean and call the pre-existing shutdown function. I know that
function works properly today. I don't know if it would work after
destroying the book. The shutdown code path expects to destroy the book
at some point during its execution, so code before that point may assume
the existence of the book and not test to see if it's really there.

David
Neil Williams
2005-07-22 19:33:51 UTC
Permalink
Post by David Hampton
Really? My way. Load the function argument (1) and function address
(2), call the is_dirty subroutine (1), load the structure offset (1),
read the value (1), return (1). Depeneding on whether or not a stack
frame is needed add another 3 instructions. 7 to 10 total instructions.
You're neglecting all the extra calls back to the book during editing when it
is already dirty. Why bother? The dirty flag in the book is there for
non-Collection data that may be added (using qof_book_set_data). When calling
qof_book_not_saved() that's checked along with the collections. I really
can't see any point in changing that. What we have will meet your needs,
there's no need to add thousands of extra calls, all identical to the first.
Post by David Hampton
Post by Neil Williams
You need to call a function via the API and that function will check the
collections. It really is no overhead.
Its anywhere from the equivalent of what I'm proposing, to seven times
more work.
I disagree, having the collections call the book at every single edit is
pointless and wasteful. Just editing one transaction could fire off SIX of
these edit calls - what's the point? It's bad enough setting this in the
collection every time, that has to be done. The book does NOT need to be told
every single time. It's a step too far for every single edit.

When the book is asked, the book finds out and passes back the answer. Far
easier.

I'd rather have the current routine that is called infrequently than a call
that is called for absolutely no reason at least once every time anything in
the application is edited.
Post by David Hampton
This *is* detecting the first change.
And then it passes the call back for the next change and the one after that ad
infinitum.

This is getting us nowhere and just delaying the rest of the work. Can't we
agree to disagree on this?
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
David Hampton
2005-07-22 17:35:32 UTC
Permalink
Post by Neil Williams
The book reports as dirty if the collections are dirty. As there are so few
collections, it makes more sense to check each collection when asked rather
than continuously set the book dirty flag every time another collection has
changed.
It's retrospective, not prospective. I see no need to check or set the book
dirty flag every single time a single entity is changed, again and again and
again throughout the entire session. Propagating the flag to the collection
is sufficient.
So why have a book dirty flag at all? If you have the flag, you must
ensure it reflects the state of the system. I'm not saying you have to
set the flag every time a change is made, just that you ensure that the
flag *always* reflects the state of the underlying collections. If it
doesn't do that then the code is broken.

Once you've ensured that the book flag correctly reflects the state of
the collections, what's the point of running the collections in
qof_book_not_saved()? It's redundant work.

David
Neil Williams
2005-07-22 19:57:05 UTC
Permalink
Post by David Hampton
So why have a book dirty flag at all?
For non-collection data (like the ENTITYREFERENCE hashtable). It's not a big
job - frankly. The burden is on the collections, where it should be.
Post by David Hampton
If you have the flag, you must
ensure it reflects the state of the system.
No, that would only be true IF the flag could be read directly. I'll double
check that the flag is good for what (little) it does. The API function
reflects the true state of the system, the flag is only one part of that. The
private header file changes will ensure that any check on the dirtiness of
the book will go via the API: qof_book_not_saved as the flag itself will be
out of bounds.
Post by David Hampton
I'm not saying you have to
set the flag every time a change is made,
?? Then why the disagreement about passing the flag back from the collection
every time a change is made ?? There is no need to set or check the book flag
every time a change is made. Can we agree on that?
Post by David Hampton
just that you ensure that the
flag *always* reflects the state of the underlying collections.
It does, absolutely it does - when called via the API. That is precisely what
the new function ensures and it is all I have been talking about all along.

Honestly, this is driving me nuts because it's an argument about nothing. The
API has been fixed to work correctly and you're worried about the internal
workings behind that API? Please just use the API, it's fine! Don't use the
flag directly, do use qof_book_not_saved and all will be well. Promise.

If it goes wrong, THEN you can complain - right now I'm happy, the API is fine
and I'm not going to argue this any further.
Post by David Hampton
If it
doesn't do that then the code is broken.
It was and I've fixed it! The fix will be in my next commit which is getting
larger (and further away) the longer we spend time on these minutiae.

Thanks to all here for the heads-up on the qof_instance_set_dirty addition,
that's done and all will be well once it's committed. Can I PLEASE get back
to my urgent work with QSF and the CLI now? I can't fix that QOF_TYPE_COLLECT
problem with so many distracting discussions flying around.
Post by David Hampton
Once you've ensured that the book flag correctly reflects the state of
the collections, what's the point of running over the collections in
qof_book_not_saved()? It's redundant work.
Because the book flag is only one part of qof_book_not_saved and the flag is
NOT in sync with the collections - qof_book_not_saved IS in sync (now) with
ALL parts of the book, so that is the API. Please use it.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
David Hampton
2005-07-22 20:36:42 UTC
Permalink
Post by Neil Williams
Post by David Hampton
I'm not saying you have to
set the flag every time a change is made,
?? Then why the disagreement about passing the flag back from the collection
every time a change is made ??
You don't have to set the flag each time a change is made if the flag is
already properly set to reflect the state of the book.
Post by Neil Williams
There is no need to set or check the book flag
every time a change is made. Can we agree on that?
No. If the flag were labelled "other non-collection data in the book" I
would agree with you. It's not. The line book->inst.dirty reads as "is
this instance of a book dirty?" If the name of the flag were something
like book->misc_stuff.inst.dirty then I could accept your
interpretation.
Post by Neil Williams
Post by David Hampton
just that you ensure that the
flag *always* reflects the state of the underlying collections.
It does, absolutely it does - when called via the API.
That's bullshit. The API correctly reflects the state of the book. The
single *flag* book->inst.dirty does not.
Post by Neil Williams
Honestly, this is driving me nuts because it's an argument about nothing. The
API has been fixed to work correctly and you're worried about the internal
workings behind that API?
Yes. QOF is still a part of the Gnucash source code so I worry about
the implementation, not just the API. You are not the only person that
will have to read and understand that code. Hidden assumptions like
this one (that book->inst.dirty doesn't have anything to do with the
state of the book, it only reflects that state of some subset of the
book) make code hard to read. If QOF were an external library like gtk
I wouldn't feel responsible for it.
Post by Neil Williams
Please just use the API, it's fine! Don't use the
flag directly, do use qof_book_not_saved and all will be well. Promise.
I have never once said that I want to use the flag directly. I have
stated that the implementation should be different, and that the code is
confusing.
Post by Neil Williams
If it goes wrong, THEN you can complain - right now I'm happy, the API is fine
and I'm not going to argue this any further.
Fine. Commit your changes, I'll get back around to doing HIG work on
the G2 port, and we'll go from there.

David

Neil Williams
2005-07-21 23:01:56 UTC
Permalink
Post by Chris Shoemaker
David wants to put a '*' in the window title when the book is dirty.
No problem, query all the collections' dirty flag. Now, say we wanted
to extend the HIG usage to the sub windows.
We want an account window's title to have '*' if the account is dirty.
Of course, what we actually mean is "does the account contain any
dirty splits?" You may think this can be handled completely by CM:
User opens an Account and adds a Trans with two splits.
Theoretically, we could catch a CM event here and mark that account
with a '*'. But, then the user jumps to the other Account in the
Transaction. This window wasn't around when the dirtying was done, so
it couldn't catch any events. But, it is dirty, so we'd want to mark
it with a '*'.
How do we discover that it contains a dirty split? Either we have to
search through *every* split in the Collection, seeing if it's in our
account and is dirty, OR, we just check our Account's "contains
something dirty" flag. Big difference.
This clearly illustrates the nature of the search: It is an OBJECT issue and
it needs to be done in the objects.

There is no harm in the dirty flag being set in the s->parent and passing that
back to the trans->parent (although this isn't currently part of how a Trans
is handled and there is no trans->parent). What cannot be done is for the
engine to follow that trail backwards because it cannot know that the Split
is a child of Trans. It's just another parameter of an object that happens to
have a name string consisting of "S""p""l""i" and "t" ("\0"). The engine has
no concept of what a Split actually IS - it's just an object of a specific
type with a set of pre-determined parameters.

What you describe above is an upward cascade which is already part of the
dirty flag setup and can easily be added to the *objects*. It has no part in
the marking of the QofCollection as dirty.

The Split can set the dirty flag in its parent account (and does already) as
this pointer is part of the Split struct. The Trans, in contrast, knows
nothing about the parent account directly - it is currently notified via the
event mechanism after a call from the Split. So the tree doesn't even exist
in the source code as we conceptualise it. The Account cannot call the Trans
directly; it would have to call the Split, which would make the call instead.
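The upward cascade described here can be sketched with toy structs; the names below are invented stand-ins, not the real GnuCash Split and Account types. The only upward link is the Split's pointer to its parent Account, so marking a Split dirty can cheaply flag the Account, while the reverse direction has no such pointer.

```c
#include <stdbool.h>

typedef struct Account {
    bool contains_dirty;     /* "something in this account changed" */
} Account;

typedef struct Split {
    bool dirty;
    Account *acc;            /* parent pointer: the only upward link */
} Split;

/* Marking a split dirty cascades one flag write up to its account.
 * The Account cannot name which Split did this; it only learns that
 * something it contains is dirty. */
void split_set_dirty(Split *s)
{
    s->dirty = true;
    if (s->acc)
        s->acc->contains_dirty = true;
}
```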

Rather than: Account -> Trans -> Split

we have: Split -> Account, Split -> Trans.

From the top:
Trans -> list of Splits in the Trans
Account -> list of Splits in the account.

So the Account can only iterate over all its splits and the Trans can only
iterate over all its splits. Neither can identify a single split without
iteration. The hierarchy is not symmetrical, nor does it accord with
the tree model.

You've also switched to only knowing IF there is a dirty split in the account,
not positively identifying WHICH split is dirty.

The only reason for a tree search is to find WHICH entity is dirty. Setting a
single gboolean flag is trivial.

Try this in ASCII / Fixed font display:

Split
|
Trans------------Account
| |
GList of Splits GList of Splits
in this Trans in this Account

Account
|
GList of Splits
|
g_list_foreach
|
Split
|
Trans


Trans
|
GList of Splits
|
g_list_foreach
|
Split
|
Account


That's how the source code implements the "tree".
(events notwithstanding.)
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Chris Shoemaker
2005-07-22 01:54:45 UTC
Permalink
Post by Neil Williams
Post by Chris Shoemaker
David wants to put a '*' in the window title when the book is dirty.
No problem, query all the collections' dirty flag. Now, say we wanted
to extend the HIG usage to the sub windows.
We want an account window's title to have '*' if the account is dirty.
Of course, what we actually mean is "does the account contain any
dirty splits?" You may think this can be handled completely by CM:
User opens an Account and adds a Trans with two splits.
Theoretically, we could catch a CM event here and mark that account
with a '*'. But, then the user jumps to the other Account in the
Transaction. This window wasn't around when the dirtying was done, so
it couldn't catch any events. But, it is dirty, so we'd want to mark
it with a '*'.
How do we discover that it contains a dirty split? Either we have to
search through *every* split in the Collection, seeing if it's in our
account and is dirty, OR, we just check our Account's "contains
something dirty" flag. Big difference.
This clearly illustrates the nature of the search: It is an OBJECT issue and
it needs to be done in the objects.
Ok. as opposed to .... where?
Post by Neil Williams
There is no harm in the dirty flag being set in the s->parent and passing that
back to the trans->parent (although this isn't currently part of how a Trans
is handled and there is no trans->parent). What cannot be done is for the
engine to follow that trail backwards because it cannot know that the Split
is a child of Trans. It's just another parameter of an object that happens to
have a name string consisting of "S""p""l""i" and "t" ("\0"). The engine has
no concept of what a Split actually IS - it's just an object of a specific
type with a set of pre-determined parameters.
What you describe above is an upward cascade which is already part of the
dirty flag setup and can easily be added to the *objects*. It has no part in
the marking of the QofCollection as dirty.
Ok. Sounds good.
Post by Neil Williams
The Split can set the dirty flag in its parent account (and does already) as
this pointer is part of the Split struct. The Trans, in contrast, knows
nothing about the parent account directly - it is currently notified via the
event mechanism after a call from the Split. So the tree doesn't even exist
in the source code as we conceptualise it. The Account cannot call the Trans
directly; it would have to call the Split, which would make the call instead.
Rather than: Account -> Trans -> Split
we have: Split -> Account, Split -> Trans.
Trans -> list of Splits in the Trans
Account -> list of Splits in the account.
So the Account can only iterate over all its splits and the Trans can only
iterate over all its splits. Neither can identify a single split without
iteration. The hierarchy is not symmetrical, nor does it accord with
the tree model.
I'm not sure it's all that complicated. I think split cascades to
account, account cascades to book, and transactions can just cascade
to book, too. With a few other things cascading up to book, I think
David would have what he wants.
Post by Neil Williams
You've also switched to only knowing IF there is a dirty split in the account,
not positively identifying WHICH split is dirty.
These are quite related. Knowing that an account does or doesn't
contain a dirty split makes it much easier to find the dirty splits.
Post by Neil Williams
The only reason for a tree search is to find WHICH entity is dirty. Setting a
single gboolean flag is trivial.
That's right, and that flag makes the tree search possible.

-chris
Post by Neil Williams
Split
|
Trans------------Account
| |
GList of Splits GList of Splits
in this Trans in this Account
Account
|
GList of Splits
|
g_list_foreach
|
Split
|
Trans
Trans
|
GList of Splits
|
g_list_foreach
|
Split
|
Account
That's how the source code implements the "tree".
(events notwithstanding.)
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
_______________________________________________
gnucash-devel mailing list
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Neil Williams
2005-07-22 09:13:55 UTC
Permalink
Post by Chris Shoemaker
Post by Neil Williams
Post by Chris Shoemaker
How do we discover that it contains a dirty split? Either we have to
search through *every* split in the Collection, seeing if it's in our
account and is dirty, OR, we just check our Account's "contains
something dirty" flag. Big difference.
This clearly illustrates the nature of the search: It is an OBJECT issue
and it needs to be done in the objects.
Ok. as opposed to .... where?
The backend, the book, the engine or the session, none of which understand
anything about what you are asking.

As the tree only exists, if at all, in the UI then only the UI can handle it.
Post by Chris Shoemaker
Post by Neil Williams
So the Account can only iterate over all its splits and the Trans can
only iterate over all its splits. Neither can identify a single split
without iteration. The hierarchy is not symmetrical, nor does it
accord with the tree model.
I'm not sure it's all that complicated. I think split cascades to
account,
Yes, but only in the OBJECT - this has nothing to do with the engine, backend,
book or session.
Post by Chris Shoemaker
account cascades to
QofCollection *coll = qof_book_get_collection(book, GNC_ID_ACCOUNT);
Post by Chris Shoemaker
and transactions can just cascade
to their collection, (QofCollection*) ... (book, GNC_ID_TRANS)

Note, singular. One object, one collection. Without a collection, there is no
object.
Post by Chris Shoemaker
to book, too.
Nope.
Post by Chris Shoemaker
With a few other things cascading up to book, I think
David would have what he wants.
There is no need for the book to be continuously told (or asked) the same
thing, thousands upon thousands of times.

When it is asked, it finds out.

Once it has found out, the result can be cached until the UI asks for a Save.
The engine has no need to make any further information available, the answer
was given and nothing will change until the process that ASKED for the
information asks for a Save operation.

Just because the UI asks twenty thousand times, doesn't mean the book will
ever give a different answer, once it is dirty, it stays dirty until the
*user* decides otherwise.
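The caching argument above can be sketched like this. All names are hypothetical, and the "expensive scan" is a stand-in for iterating the collections: the book answers once, the UI can ask twenty thousand times, and only a Save (or a new change) alters the answer.

```c
#include <stdbool.h>

typedef struct {
    bool cached_dirty;
    bool cache_valid;
} BookCache;

/* Stand-in for the expensive work: iterating every collection. */
static bool expensive_scan(void)
{
    return true;             /* pretend a change was found */
}

/* Repeated UI queries hit the cache; the scan runs at most once per
 * save cycle. */
bool book_is_dirty(BookCache *bc)
{
    if (!bc->cache_valid) {
        bc->cached_dirty = expensive_scan();
        bc->cache_valid = true;
    }
    return bc->cached_dirty;
}

void book_on_save(BookCache *bc)
{
    bc->cached_dirty = false;
    bc->cache_valid = true;  /* clean until the next change */
}

void book_on_change(BookCache *bc)
{
    bc->cache_valid = false; /* next query re-scans */
}
```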
Post by Chris Shoemaker
Post by Neil Williams
You've also switched to only knowing IF there is a dirty split in the
account, not positively identifying WHICH split is dirty.
These are quite related. Knowing that an account does or doesn't
contain a dirty split makes it much easier to find the dirty splits.
Unfortunately not. The engine cannot know that a dirty "account" (whatever
that is) means a dirty Split (whatever that might be) exists somewhere. There
is no relationship. The engine only knows that this instance is dirty, that
collection is dirty and therefore the book is dirty. End.

You continue to assert that the engine can follow a path that simply does not
exist. There is no tree!
Post by Chris Shoemaker
Post by Neil Williams
The only reason for a tree search is to find WHICH entity is dirty.
Setting a single gboolean flag is trivial.
That's right, and that flag makes the tree search possible.
It does not and cannot because no tree exists to search!!

Seeing as you've quoted this, I'll enhance it a bit:

NOTE: All paths indicated here are UNIDIRECTIONAL, from top to bottom. There
is *no* reverse call or symmetry, implied or otherwise.
Post by Chris Shoemaker
Post by Neil Williams
Split
| |
Trans------ ------Account
| |
GList of Splits GList of Splits
in this Trans in this Account
Note the disconnect between Trans and Account.

Once you've obtained a Trans from the Split, you cannot ask the Trans which
Split it came from - it doesn't know! You have to store that information from
when you first found the Split. Same with Account. Each knows its list of
Splits but cannot tell you WHICH in that list is the one you want.

So you know an Account is dirty, big deal. There is no way the engine can tell
you WHICH split in that account is dirty without iteration. To the engine,
it's just an object and a set of parameters.
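The IF/WHICH distinction in miniature: a toy account-level flag answers IF anything is dirty in O(1), but finding WHICH split is dirty still requires walking the account's list. Types and names below are invented for illustration, not the real engine API.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool dirty;
} Split;

typedef struct {
    bool contains_dirty;     /* cheap IF answer */
    Split *splits;
    size_t n_splits;
} Account;

/* Returns the first dirty split, or NULL. The flag lets callers skip
 * the walk entirely for clean accounts; identifying WHICH split is
 * dirty still costs an iteration. */
Split *account_find_dirty_split(Account *a)
{
    if (!a->contains_dirty)
        return NULL;                      /* O(1) early out */
    for (size_t i = 0; i < a->n_splits; i++)
        if (a->splits[i].dirty)
            return &a->splits[i];
    return NULL;
}
```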
Post by Chris Shoemaker
Post by Neil Williams
That's how the source code implements the "tree".
(events notwithstanding.)
Any communication between an Account and a Trans MUST occur via the relevant
Split and that requires iteration, at the very least a qof_entity_lookup from
the GUID and GNC_ID_SPLIT. That's a GHashTable lookup, it's highly optimised
but the hashtable still needs to be initialised with iteration. Yet even this
lookup has little to do with the "tree".

Please forget the entire idea of a "tree search" - there is no tree, there
never has been.

I really wish I had said that the first time you mentioned this craziness.

Another note to self:
Listen to Derek, he's been here before and if he says there's no tree, there
simply is no tree!!!
:-)
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
Chris Shoemaker
2005-07-22 14:25:47 UTC
Permalink
Post by Neil Williams
Post by Chris Shoemaker
Post by Neil Williams
Post by Chris Shoemaker
How do we discover that it contains a dirty split? Either we have to
search through *every* split in the Collection, seeing if it's in our
account and is dirty, OR, we just check our Account's "contains
something dirty" flag. Big difference.
This clearly illustrates the nature of the search: It is an OBJECT issue
and it needs to be done in the objects.
Ok. as opposed to .... where?
The backend, the book, the engine or the session, none of which understand
anything about what you are asking.
As the tree only exists, if at all, in the UI then only the UI can handle it.
Nope. Accounts contain a list of Splits. This is an essential part
of an Account, and *something* needs to know about it and take
advantage of it. Other than the GUI.
Post by Neil Williams
Post by Chris Shoemaker
Post by Neil Williams
So the Account can only iterate over all its splits and the Trans can
only iterate over all its splits. Neither can identify a single split
without iteration. The hierarchy is not symmetrical, nor does it
accord with the tree model.
I'm not sure it's all that complicated. I think split cascades to
account,
Yes, but only in the OBJECT - this has nothing to do with the engine, backend,
book or session.
Post by Chris Shoemaker
account cascades to
QofCollection *coll = qof_book_get_collection(book, GNC_ID_ACCOUNT);
Post by Chris Shoemaker
and transactions can just cascade
to their collection, (QofCollection*) ... (book, GNC_ID_TRANS)
Note, singular. One object, one collection. Without a collection, there is no
object.
Post by Chris Shoemaker
to book, too.
Nope.
Post by Chris Shoemaker
With a few other things cascading up to book, I think
David would have what he wants.
There is no need for the book to be continuously told (or asked) the same
thing, thousands upon thousands of times.
You could say the same thing about the Collection. But actually, there is just as much reason in both cases.
Post by Neil Williams
When it is asked, it finds out.
Once it has found out, the result can be cached until the UI asks for a Save.
The engine has no need to make any further information available, the answer
was given and nothing will change until the process that ASKED for the
information asks for a Save operation.
Just because the UI asks twenty thousand times, doesn't mean the book will
ever give a different answer, once it is dirty, it stays dirty until the
*user* decides otherwise.
Post by Chris Shoemaker
Post by Neil Williams
You've also switched to only knowing IF there is a dirty split in the
account, not positively identifying WHICH split is dirty.
These are quite related. Knowing that an account does or doesn't
contain a dirty split makes it much easier to find the dirty splits.
Unfortunately not. The engine cannot know that a dirty "account" (whatever
that is) means a dirty Split (whatever that might be) exists somewhere. There
is no relationship. The engine only knows that this instance is dirty, that
collection is dirty and therefore the book is dirty. End.
You missed the point here. Forget about the engine. This was a
*mathematical* statement about searches. If you can learn whether what
you're searching for is present (or absent) in a subset of the places
you could look, at a cost lower than actually looking in all those
places, then your search gets cheaper. This has
nothing to do with anything the "engine cannot know."
Post by Neil Williams
You continue to assert that the engine can follow a path that simply does not
exist. There is no tree!
No. I made no assertion about *who* can follow that path. But the
path does and must exist and we use it already to accomplish most of
the financial operations.
Post by Neil Williams
Post by Chris Shoemaker
Post by Neil Williams
The only reason for a tree search is to find WHICH entity is dirty.
Setting a single gboolean flag is trivial.
That's right, and that flag makes the tree search possible.
It does not and cannot because no tree exists to search!!
NOTE: All paths indicated here are UNIDIRECTIONAL, from top to bottom. There
is *no* reverse call or symmetry, implied or otherwise.
Post by Chris Shoemaker
Post by Neil Williams
Split
| |
Trans------ ------Account
| |
GList of Splits GList of Splits
in this Trans in this Account
Note the disconnect between Trans and Account.
You're really hung up on this. Transactions and Accounts aren't
directly related. Neither are Budgets and Vendors. So what?

-chris
Post by Neil Williams
Once you've obtained a Trans from the Split, you cannot ask the Trans which
Split it came from - it doesn't know! You have to store that information from
when you first found the Split. Same with Account. Each knows its list of
Splits but cannot tell you WHICH in that list is the one you want.
So you know an Account is dirty, big deal. There is no way the engine can tell
you WHICH split in that account is dirty without iteration. To the engine,
it's just an object and a set of parameters.
Post by Chris Shoemaker
Post by Neil Williams
That's how the source code implements the "tree".
(events notwithstanding.)
Any communication between an Account and a Trans MUST occur via the relevant
Split and that requires iteration, at the very least a qof_entity_lookup from
the GUID and GNC_ID_SPLIT. That's a GHashTable lookup, it's highly optimised
but the hashtable still needs to be initialised with iteration. Yet even this
lookup has little to do with the "tree".
Please forget the entire idea of a "tree search" - there is no tree, there
never has been.
I really wish I had said that the first time you mentioned this craziness.
Listen to Derek, he's been here before and if he says there's no tree, there
simply is no tree!!!
:-)
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
_______________________________________________
gnucash-devel mailing list
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Derek Atkins
2005-07-22 14:43:39 UTC
Permalink
[ please trim replies. these messages are getting pretty long. thanks ]
Post by Chris Shoemaker
Nope. Accounts contain a list of Splits. This is an essential part
of an Account, and *something* needs to know about it and take
advantage of it. Other than the GUI.
The Account Object Definition takes care of it. The implementation
of the Account Object does it.
Post by Chris Shoemaker
Post by Neil Williams
Unfortunately not. The engine cannot know that a dirty "account" (whatever
that is) means a dirty Split (whatever that might be) exists somewhere. There
is no relationship. The engine only knows that this instance is dirty, that
collection is dirty and therefore the book is dirty. End.
You missed the point here. Forget about the engine. This was a
*mathematical* statement about searches. If you can learn whether what
you're searching for is present (or absent) in a subset of the places
you could look, at a cost lower than actually looking in all those
places, then your search gets cheaper. This has
nothing to do with anything the "engine cannot know."
True, but caching this information (which is effectively what you're
doing) comes at a cost. You need to store extra data somewhere
(increase storage cost) in order to reduce (time) cost of a search.
All well and good, provided you know a priori which searches you need
to optimize.

QOF is a general search engine and really does NOT understand some of
the optimizations that can be made. For example, we actually lost a
particular optimization in the move from a "Search Accounts for
Splits/Transactions" to QOF: we lost the ability to reduce the
search-time by limiting the search to only Splits in particular
Accounts. This lossage happened necessarily because QOF does not
understand that Accounts contain Splits.

_Accounts_ know that Accounts contain Splits, but QOF does not... And
it's QOF that performs that search.

Now, if QOF were extended in such a way that these relationships could
be declared, it may be possible to regain that optimization. Maybe.
Post by Chris Shoemaker
Post by Neil Williams
You continue to assert that the engine can follow a path that simply does not
exist. There is no tree!
No. I made no assertion about *who* can follow that path. But the
path does and must exist and we use it already to accomplish most of
the financial operations.
And see, that is exactly the issue. There's an abstraction barrier
which you want to cross. :) Yes, the information does exist, but
it's not available at all levels.

-derek
--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
***@MIT.EDU PGP key available
Chris Shoemaker
2005-07-22 19:23:13 UTC
Permalink
Post by Derek Atkins
Post by Chris Shoemaker
Post by Neil Williams
Unfortunately not. The engine cannot know that a dirty "account" (whatever
that is) means a dirty Split (whatever that might be) exists somewhere. There
is no relationship. The engine only knows that this instance is dirty, that
collection is dirty and therefore the book is dirty. End.
You missed the point here. Forget about the engine. This was a
*mathematical* statement about searches. If you can learn whether what
you're searching for is present (or absent) in a subset of the places
you could look, at a cost lower than actually looking in all those
places, then your search gets cheaper. This has
nothing to do with anything the "engine cannot know."
True, but caching this information (which is effectively what you're
doing) comes at a cost. You need to store extra data somewhere
(increase storage cost) in order to reduce (time) cost of a search.
All well and good, provided you know a priori which searches you need
to optimize.
Updating a boolean in the account doesn't cost any more than updating
a boolean in the collection. (Of course, you'd probably want both, so
it'll cost twice.)
Post by Derek Atkins
QOF is a general search engine and really does NOT understand some of
the optimizations that can be made. For example, we actually lost a
particular optimization in the move from a "Search Accounts for
Splits/Transactions" to QOF: we lost the ability to reduce the
search-time by limiting the search to only Splits in particular
Accounts. This lossage happened necessarily because QOF does not
understand that Accounts contain Splits.
_Accounts_ know that Accounts contain Splits, but QOF does not... And
it's QOF that performs that search.
Little red flags just popped up! I know that QOF offers generalized
search and that's powerful, but let me just think (out loud) for a sec
about what a financial app like Gnucash actually needs to do.

What's one of the most common operations? Maybe opening a view of all
the splits in an account, viewed as transactions. Therefore, what's
probably the *most common* query? I'm guessing it's probably the
query that finds all the splits in an account. That query is probably
run 100 times more frequently than any other.

What's the most common object type? Probably splits: there are probably
10 times more splits than the next most common object (probably
Transactions). (I have mostly Transactions with many splits, but
for a different user, this may be more like 3 times.)

So, the by-far most common query has to iterate over the most common
object every time we open or refresh a register. And even though the
application-specific, non-generic, financial-logic-containing Account
objects have exactly the list we need already stored, we want to use
the generic, powerful, QOF-Query that's so flexible but has to iterate
over every split in my book and check its Account just to return the
list that the Account already had!

Please, seriously, please tell me I'm making all this up.

Here's my take on this: We shouldn't be constrained from using the
relationships between financial objects just because some generic
library can't interpret them. Use the library for what it can do well
(storage? generalized search?). Use the application domain
relationships in the application where it makes sense.
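Chris's cost contrast can be sketched as follows: a generic query must test every split in the book against the account, while the anchored case just reads the list the Account already keeps. Both the types and the function names below are invented for illustration; the real QofQuery machinery works differently.

```c
#include <stddef.h>

typedef struct {
    int account_id;          /* toy stand-in for the split's account link */
} Split;

typedef struct {
    size_t n_splits;         /* the list the Account already keeps */
} Account;

/* Generic-engine style: one predicate test per split in the book,
 * O(N) in the total number of splits. */
size_t count_splits_generic(const Split *all, size_t n, int account_id)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        if (all[i].account_id == account_id)
            hits++;
    return hits;
}

/* Anchored style: the relationship is stored, so no scan is needed. */
size_t count_splits_anchored(const Account *a)
{
    return a->n_splits;
}
```

Both return the same answer; the difference is whether the application exploits the containment relationship or rediscovers it on every query.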

Implications? 1) My re-written register will allow "anchored" account
cases where QofQuery is not even used, along with ones where it is.
2) I don't see any problem at all with dirtiness propagating back to a
flag in the book where ever containment relationships exist among the
financial objects.

-chris
Derek Atkins
2005-07-22 14:23:33 UTC
Permalink
Post by Chris Shoemaker
Post by Neil Williams
So the Account can only iterate over all it's splits and the Trans can only
iterate over all it's splits. Neither can identify a single split without
iteration. The hierarchy is not symmetrical neither is does it accord with
the tree model.
I'm not sure it's all that complicated. I think split cascades to
account, account cascades to book, and transactions can just cascade
to book, too. With a few other things cascading up to book, I think
David would have what he wants.
What about Customers? Invoices? PriceDB Entries? SXes? Commodities?

There are lots of objects in the database that can be touched/modified
that don't fall into the CoA tree structure. Please don't limit
yourself to thinking only about the CoA.

Honestly, I really don't think we need to know which objects are
dirty. I just don't see that as a requirement for anything we're
doing at the moment, or in the future.
could just create a second HashTable in each Collection and put a
reference to each committed/changed object into that second HashTable.
It means we'd effectively need twice the amount of metadata storage,
but I don't think those hash tables really take up a lot of space.
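Derek's "second HashTable per Collection" idea, shrunk to a toy pointer set so it stays self-contained (a real implementation would presumably use a GHashTable keyed by entity); all names here are invented for the sketch.

```c
#include <stddef.h>

#define MAX_DIRTY 64

typedef struct {
    const void *dirty[MAX_DIRTY];   /* stand-in for the second table */
    size_t n_dirty;
} Collection;

/* On commit of a changed entity, record a reference to it. With this
 * in place, "which objects are dirty" is a direct lookup instead of a
 * full iteration over the collection. */
void collection_mark_dirty(Collection *c, const void *entity)
{
    for (size_t i = 0; i < c->n_dirty; i++)
        if (c->dirty[i] == entity)
            return;                  /* already recorded */
    if (c->n_dirty < MAX_DIRTY)
        c->dirty[c->n_dirty++] = entity;
}

void collection_on_save(Collection *c)
{
    c->n_dirty = 0;                  /* everything committed is clean */
}
```

The cost Derek notes shows up here directly: the set roughly doubles the per-collection metadata in exchange for the cheap "which" answer.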

However, I still don't think we need that at the moment.

-derek
--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
***@MIT.EDU PGP key available
Chris Shoemaker
2005-07-22 14:31:48 UTC
Permalink
Post by Derek Atkins
Post by Chris Shoemaker
Post by Neil Williams
So the Account can only iterate over all it's splits and the Trans can only
iterate over all it's splits. Neither can identify a single split without
iteration. The hierarchy is not symmetrical neither is does it accord with
the tree model.
I'm not sure it's all that complicated. I think split cascades to
account, account cascades to book, and transactions can just cascade
to book, too. With a few other things cascading up to book, I think
David would have what he wants.
What about Customers? Invoices? PriceDB Entries? SXes? Commodities?
Most branches are probably shallow. If an object isn't contained by
anything other than the book, then it's contained by the book.
Post by Derek Atkins
There are lots of objects in the database that can be touched/modified
that don't fall into the CoA tree structure. Please don't limit
yourself to thinking only about the CoA.
Honestly, I really don't think we need to know which objects are
dirty. I just don't see that as a requirement for anything we're
doing at the moment, or in the future. Besides, if we wanted to, we
could just create a second HashTable in each Collection and put a
reference to each committed/changed object into that second HashTable.
It means we'd effectively need twice the amount of metadata storage,
but I don't think those hash tables really take up a lot of space.
However, I still don't think we need that at the moment.
I agree that we don't currently (and probably never will) need to
track a list of *references* to dirty objects. But, a boolean flag
that propagates back to book might be more useful. And if we ever
*do* need to find the dirty instances, the boolean flag makes the
search much easier.

-chris
Post by Derek Atkins
-derek
--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
Derek Atkins
2005-07-22 14:55:25 UTC
Permalink
Post by Chris Shoemaker
Post by Derek Atkins
What about Customers? Invoices? PriceDB Entries? SXes? Commodities?
Most branches are probably shallow. If an object isn't contained by
anthing other than the book, then it's contained by the book.
You're trying to break the QOF object abstraction. I think that's what Neil is
complaining about.
Post by Chris Shoemaker
Post by Derek Atkins
However, I still don't think we need that at the moment.
I agree that we don't currently (and probably never will) need to
track a list of *references* to dirty objects. But, a boolean flag
that propagates back to book might be more useful. And if we ever
*do* need to find the dirty instances, the boolean flag makes the
search much easier.
And we already have that in the Collection Metadata. Each Collection has a
boolean flag that says "has anything in this collection changed?" QOF knows
about collections so it could drill down inside a collection if we really
needed to. QOF, however, doesn't know that Accounts contain Splits, and
there's no way in the QOF Definition to say that in a way that tells QOF to use
that relationship instead of the Collections.

So we can already do this, indeed, we DO already do this now. Just not the way
you want ;)
Post by Chris Shoemaker
-chris
-derek
--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
***@MIT.EDU PGP key available
David Hampton
2005-07-20 19:51:02 UTC
Permalink
Post by Neil Williams
?? All core structs contain a QofInstance which itself contains a QofEntity.
Did QoF get extended to commodities and prices?

David
Neil Williams
2005-07-20 20:17:00 UTC
Permalink
Post by David Hampton
Post by Neil Williams
?? All core structs contain a QofInstance which itself contains a QofEntity.
Did QoF get extended to commodities and prices?
Not yet, but I do have ideas on how to do it and the CLI will be the testing
ground.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/