I made good progress with flattening over the last couple days.
On Saturday, we had a party. Before beginning his 90 minute drive home, Carlos humored me by looking at the in-progress web interface. I demonstrated how flattening would be useful there.
Carlos assumed that I was going to add a key word to the query to indicate that it should come back flat. My immediate response was that flat indicated just the desired output structure of the data, and really wasn’t part of the query itself. My plan was to add an OPTIONS method parameter and one parameter would be flattening indicator.
But, the suggestion stuck, and the more I thought about it, the more I liked it. As for my argument: screw it. SQL Server can return xml by adding formatting instructions to the query. Tossing the option into the command is convenient.
from r in MovieReview
select r.ReviewDate, r.Movie.ProductName, r.Author.ScreenName
where r.Movie.ProductName contains 'season'
orderby r.Author.ScreenName ascending, r.ReviewDate descending
There it is. The new FLAT keyword.
Non-flat noodb queries require two passes. The first pass builds up a CTE containing all of the joins necessary to run the WHERE and ORDER BY. That query just returns the object ids. The second pass builds all of the objects that came back in the list.
The easy way to FLAT would’ve been to continue doing it the same way, but it would be wasteful. The CTE that’s used to determine the object IDs can return the values at the same time. It’s more efficient to get it right up front.
I added a new method to IObjectDataContext called GetObjectData(). The non-flat method is GetObjectIds(). In the implementation, the code is pretty much the same. The new method adds fields to the select. The old one does not.
The return value of GetObjectData is a DataTable, which I have mixed feelings about, but I’m going with it at least for now. The server then converts the data table to a collection of FlatParts. Each flatpart has an array of values. I need to add a header row with the column information. (I could add it to every row, but that would be lazy. It would be way too much repetitive data.)
Also, there’s one thing I’m going to need that’s probably not going to work. When you do the SELECT statement now, you can specify * or ** for nested objects.
from r in MovieReview select r.Movie.**
That means get the Movie object.
It specifically does not support
from r in MovieReview select r.Movie
There’s wasn’t a point to that. Now, there may be. When flattening the result set, maybe you want the Object Id to come back as part of the query. That desire can’t be inferred, it’d have to be explicit. So, I can do it as shown above, or I can do
from r in MovieReview select r.Movie.ObjectId
But, that’s inconsistent since movie doesn’t have a property called object id. (But what if it did? Conflict!) I’d want to be able to get that info back somehow (in flat only) for linking purposes.
My final thought for the evening: above, I described the 2-pass approach. Some of you, such as me, may wonder why to bother with that second pass. You can use the same logic as the flat approach to justify pulling back the data in the first query.
I didn’t do that for a few reasons. The main one is that we’re dealing with object hierarchies, so getting back all that data flat is complicated. It’s mostly done; it needs to be to support the WHERE and ORDER BY, but it doesn’t work on collections. It’d be big and hairy, but would work as long as you eliminate collections from the mix.
The second reason is that the plan is to cache the objects and not hit the database the second time. Theoretically, some day, it will just get the ids then return the objects. Someday beyond that, it might not even have to hit the database to get the object ids. It can do a query against memory.
In conclusion: Flatting is almost there. It won’t work for collections (yet). I just have to update the page to recongnize the new FlatPart object.