Performance issue of NotesView.getDocumentByKey() or .getEntryByKey() - why?
Julian Buss, September 16th, 2010 10:15:13
Tags: LotusNotes852
Forgive me if this is common knowledge and I'm just dumb, perhaps I don't see the obvious.
Assume you have a very simple view with just one column named "key", sorted ascending. The view contains about 80,000 documents. 10 of these documents have different keys; the other 79,990 documents all have the same key (let's say "mytestkey"). Now run this code:

    set doc = view.getDocumentByKey("somekey")
    set doc = view.getDocumentByKey("mytestkey")

You will notice that the first call is very fast, but the second call takes ages (about 4 seconds in my test case). It makes only a minor difference whether I use .getDocumentByKey() or .getEntryByKey(). I just want the first document with the key. Does anyone know why these methods are so slow when many documents have the same key?

Update: it's clearly reproducible, even when run locally. Here is a demo database: download it, open it, and run the agent 'performance test'.

Update 12 Oct 2010: IBM is working on the problem and a fix is being tested.
Comments (16)
1) Performance issue of NotesView.getDocumentByKey() or .getEntryByKey() - why?
I imagine that, since you want a single document, Domino has to find the "first" one matching the key, which may trigger some "big" scan. Is there a second sort column? (Maybe it can help.)
2) Performance issue of NotesView.getDocumentByKey() or .getEntryByKey() - why?
No, no second sort column yet, but I will test that. Thanks for the hint.
3) Performance issue of NotesView.getDocumentByKey() or .getEntryByKey() - why?
Is this running between a client and server, or on a server only, or a client only? GDBK *should* be very fast. GetEntryByKey should be the same speed, unless there are many children for the entry (responses in your case, since this is a document). Have you tried this against 8.5.1 or an earlier version of Domino?
4) Performance issue of NotesView.getDocumentByKey() or .getEntryByKey() - why?
Running between client and server, no responses at all. Tried with 8.5.2 (client and server).
5) Performance issue of NotesView.getDocumentByKey() or .getEntryByKey() - why?
That's definitely odd. Can you try against an earlier version? If not, can you email me the DB and I'll try it for you?
6) Performance issue of NotesView.getDocumentByKey() or .getEntryByKey() - why?
Erik, it's clearly reproducible. I sent you a demo db.
7) Performance issue of NotesView.getDocumentByKey() or .getEntryByKey() - why?
Not odd. As far as I know, getEntryByKey() reads the view data directly into a result buffer, which needs to be extended when full (after all, it presumes only a few hits for a "key"). getDocumentByKey() initializes all the document objects. So the entry lookup should be faster, but might run afoul of the buffer. Did you compare to getAllEntriesByKey()?
8) Performance issue of NotesView.getDocumentByKey() or .getEntryByKey() - why?
Stephan, the getEntryByKey() method is nearly as slow as getDocumentByKey()! In my test there was only a second or so of difference. I expected the view's index to be read only until the first hit is found (which would be fast), but this seems not to be the case.
9) Performance issue of NotesView.getDocumentByKey() or .getEntryByKey() - why?
It's much worse than you think. The performance benchmarks are all out of whack in ways that make no sense. Getting single element returns is pretty much just as expensive as getting the entire collection, and getting the entry collection is actually MORE expensive than getting the document collection. Don't know if it'll come through, but here's the DXL for my refactoring of Julian's performance test agent.

<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE agent SYSTEM 'xmlschemas/domino_8_5_2.dtd'>
<agent name='performance test' xmlns='{ Link } version='8.5' maintenanceversion='2.0' replicaid='C12577A000442E3D' hide='v3' publicaccess='false' designerversion='8.5.2'>
<trigger type='actionsmenu'/>
<documentset type='runonce'/><code event='options'><lotusscript>'/*
' * Agent create documents
' * Created Sep 16, 2010 by Julian Buss/YouAtNotes
' * Description: Comments for Agent
' */
Option Public
Option Declare
%Include "LSCONST.LSS"
</lotusscript></code><code event='declarations'><lotusscript>Dim session As NotesSession
Dim view As NotesView
</lotusscript></code><code event='initialize'><lotusscript>
Sub Initialize
    On Error GoTo errorHandler
    Set session = New NotesSession
    Set view = session.currentDatabase.getView("($SystemDocs)")
    Dim doc As NotesDocument
    Dim entry As NotesViewEntry
    Dim docColl As NotesDocumentCollection
    Dim entryColl As NotesViewEntryCollection
    Print "START TICKS " + CStr(GetThreadInfo(LSI_THREAD_TICKS))
    ExecuteLookup "Document", "aaa"
    ExecuteLookup "Document", "mytestkey"
    ExecuteLookup "Entry", "aaa"
    ExecuteLookup "Entry", "mytestkey"
    ExecuteLookup "AllDocuments", "aaa"
    ExecuteLookup "AllDocuments", "mytestkey"
    ExecuteLookup "AllEntries", "aaa"
    ExecuteLookup "AllEntries", "mytestkey"
    Print "END TICKS " + CStr(GetThreadInfo(LSI_THREAD_TICKS))
errorExit:
    Exit Sub
errorHandler:
    Print CStr(Erl)+" "+Error$
    Resume errorExit
End Sub
</lotusscript></code><code event='ExecuteLookup'><lotusscript>%REM
    Sub ExecuteLookup
    Description: Comments for Sub
%END REM
Sub ExecuteLookup (LUType As String, LUKey As String)
    Dim doc As NotesDocument
    Dim entry As NotesViewEntry
    Dim docColl As NotesDocumentCollection
    Dim entryColl As NotesViewEntryCollection
    Dim startTicks As Long
    Dim incremTicks As Long
    startTicks = GetThreadInfo(LSI_THREAD_TICKS)
    Select Case LUType
        Case "Document": Set doc = view.Getdocumentbykey(LUKey, True)
        Case "Entry": Set entry = view.Getentrybykey(LUKey, True)
        Case "AllDocuments": Set docColl = view.Getalldocumentsbykey(LUKey, True)
        Case "AllEntries": Set entryColl = view.Getallentriesbykey(LUKey, True)
    End Select
    incremTicks = GetThreadInfo(LSI_THREAD_TICKS)-startTicks
    Print "Performed lookup type " + LUtype + " using key '" + LUKey + "' in " + CStr(incremTicks) + " ticks"
End Sub</lotusscript></code>
<rundata processeddocs='0' exitcode='0' agentdata='B56F56BC04A48C96852577A0004D39B1'>
<agentmodified><datetime dst='true'>20100916T100324,91-04</datetime></agentmodified>
<agentrun><datetime dst='true'>20100916T100428,43-04</datetime></agentrun>
<runlog>Started running agent 'performance test' on 09/16/2010 10:04:19 AM
Ran LotusScript code
Done running agent 'performance test' on 09/16/2010 10:04:28 AM
</runlog></rundata></agent>
10) Performance issue of NotesView.getDocumentByKey() or .getEntryByKey() - why?
Strange, indeed. I cannot believe that this works as designed. One could argue that the use case is not very common (why should one have thousands of documents with the same key?), but nevertheless it can happen, and it feels like a bug that the lookup methods behave this way.
11) Performance issue of NotesView.getDocumentByKey() or .getEntryByKey() - why?
FYI, coded up the test in Java and it comes back with identical results....

Performed lookup type Document using key 'aaa' in 0 ms
Performed lookup type Document using key 'mytestkey' in 2028 ms
Performed lookup type Entry using key 'aaa' in 0 ms
Performed lookup type Entry using key 'mytestkey' in 1763 ms
Performed lookup type AllDocuments using key 'aaa' in 0 ms
Performed lookup type AllDocuments using key 'mytestkey' in 1856 ms
Performed lookup type AllEntries using key 'aaa' in 0 ms
Performed lookup type AllEntries using key 'mytestkey' in 3183 ms
END TEST 8845
12) Performance issue of NotesView.getDocumentByKey() or .getEntryByKey() - why?
Nathan, if you have some additional minutes to spend: what if the tests are performed by the XPages engine? Since it's Java, too, it SHOULD return the same results. But instinct tells me it could be faster there for some reason.
13) Performance issue of NotesView.getDocumentByKey() or .getEntryByKey() - why?
Also, no measurable difference when the view's key column is categorized.

I agree this seems to be not as designed, but there is a certain logic to it. The NIFFindByKey function in the C API returns the CollectionPosition and number of matches whether the calling function only cares about the first match or not. So it's only after the API call to find a position in the index is executed that the calling code must decide whether to loop through NIFReadEntries or simply get the one entry at the position and stop. What would seem to be comparatively expensive, given the testing numbers, is discovering that number of matches. Someone probably thought it was inexpensive to count the matches early on in the API. Perhaps it used to be stored in the serialization or something. But clearly getting the counts with tens of thousands of matches takes a lot longer than getting the counts with tens of matches.

It would seem an easy and cheap performance improvement to add a flag or extend NIFFindByKey to tell it to not return the number of matches, and then to switch the LS/Java APIs to use that version of the Find when only requesting a single document. But I don't know if IBM will care to do it.
14) Performance issue of NotesView.getDocumentByKey() or .getEntryByKey() - why?
Solution: Use ViewNavigator. View.createViewNavFromCategory(key) followed by getting the first entry is EXTREMELY fast.

Performed lookup type Document using key 'aaa' in 7 ms
Performed lookup type Document using key 'mytestkey' in 2001 ms
Performed lookup type Entry using key 'aaa' in 1 ms
Performed lookup type Entry using key 'mytestkey' in 1637 ms
Performed lookup type AllDocuments using key 'aaa' in 0 ms
Performed lookup type AllDocuments using key 'mytestkey' in 1742 ms
Performed lookup type AllEntries using key 'aaa' in 1 ms
Performed lookup type AllEntries using key 'mytestkey' in 3073 ms
Performed lookup type Navigator using key 'aaa' in 1 ms
Performed lookup type Navigator using key 'mytestkey' in 1 ms
END TEST 8469
15) Performance issue of NotesView.getDocumentByKey() or .getEntryByKey() - why?
Ah yes, the ViewNavigator, I almost forgot about it. Thanks for the hint, it's a good solution.
16) Performance issue of NotesView.getDocumentByKey() or .getEntryByKey() - why?
One more addition for other readers: NotesView.createViewNavFromCategory(key).getFirst().document works with uncategorized views, too, but is then as slow as getDocumentByKey(). That means you need to categorize your view. And if you do that, you need NotesView.createViewNavFromCategory(key).getFirstDocument().document (instead of getFirst()). I can confirm that using the view navigator is very fast (the lookup runs almost instantly). So it's a workaround if you can categorize your view. Nathan, thank you very much for all your efforts!
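
For readers who want to try the navigator workaround from the last comments, here is a minimal LotusScript sketch. It assumes the first column of the view is categorized, as the last comment requires. The helper name GetFirstDocByCategory is hypothetical and does not come from the thread. It uses NotesView.CreateViewNavFromCategory, the method that takes a category name, and NotesViewNavigator.GetFirstDocument, which skips category rows. It has not been run against the demo database, so treat it as a starting point rather than a tested fix.

```lotusscript
' Hypothetical helper (not from the original thread).
' Returns the first document filed under the category "key",
' or Nothing if the category has no documents.
' Assumption: the first column of the view is categorized.
Function GetFirstDocByCategory(view As NotesView, key As String) As NotesDocument
    Dim nav As NotesViewNavigator
    Dim entry As NotesViewEntry

    Set GetFirstDocByCategory = Nothing

    ' Build a navigator restricted to the entries under this category
    Set nav = view.CreateViewNavFromCategory(key)

    ' GetFirstDocument returns the first document entry, skipping category rows
    Set entry = nav.GetFirstDocument()
    If Not (entry Is Nothing) Then
        Set GetFirstDocByCategory = entry.Document
    End If
End Function
```

A call such as Set doc = GetFirstDocByCategory(view, "mytestkey") would then stand in for view.getDocumentByKey("mytestkey"). The caller should still check doc for Nothing before using it.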
