Sellerdeck database

    Sellerdeck database

    So, the question I have is when is Sellerdeck going to make the switch to using a proper database on the web server?

    We all understand the issues. A proper, structured database is designed to respond very quickly and efficiently to queries and return only the desired data. It can also return the desired data set without all the 'filtering' having to be defined beforehand, whether that be by price, brand, size, weight, colour, function, etc.

    In a reply on the 'Slow Ajax' thread, the Sellerdeck CTO, Hugh Gibson, has just posted:

    Regarding the other comments: What is a database? A collection of data. We have that on the website, in a structured association of keys and values. We use that to drive filtering and it works very quickly when it's set up correctly.
    I'm sorry. But what Sellerdeck are using right now is not a proper database-driven backend for the website. It's really a halfway hack and is full of compromises.

    I'm sorry if my terminology upsets some of the people at Sellerdeck, but I hope they can understand my frustration when what should be a fairly simple thing to do properly becomes a three-release saga because short-termism drives development down side paths.

    Please, don't tell me that this is the future going forward for sellerdeck.

    Anyone?

    Mike
    -----------------------------------------

    First Tackle - Fly Fishing and Game Angling

    -----------------------------------------

    #2
    We all understand the issues. A proper, structured database is designed to respond very quickly and efficiently to queries and return only the desired data.
    So you are wanting easy access to a SQL database?

    I've written a couple of heavily used websites in .NET with SQL Server database backend, so I'm very familiar with the concepts.

    But what Sellerdeck are using right now is not a proper database-driven backend for the website. It's really a halfway hack and is full of compromises.
    I would be interested to know your view on these "compromises". My view is that they are full of optimisations. I'll be happy to have a chat with you about that.

    If we did use a SQL database we would still hide it behind an API, preventing SQL Injection attacks. We would also carefully define the structure and indexes so that all the queries worked efficiently (the desktop database does not have a suitable structure for putting on the server). We wouldn't be enabling designers to write general-purpose queries.
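As an illustration of the idea in the paragraph above (SQL hidden behind an API, with the structure and indexes fixed in advance), here is a minimal sketch in Python using the standard-library sqlite3 module. The table, column names, index and sample rows are all hypothetical; nothing here reflects Sellerdeck's actual schema or code.

```python
import sqlite3


def open_catalogue() -> sqlite3.Connection:
    """Build a tiny in-memory catalogue (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product ("
        "ref TEXT PRIMARY KEY, name TEXT, brand TEXT, price REAL)"
    )
    # Index chosen in advance for the one query shape the API supports.
    conn.execute("CREATE INDEX idx_brand_price ON product (brand, price)")
    conn.executemany(
        "INSERT INTO product VALUES (?, ?, ?, ?)",
        [
            ("P1", "Floating fly line", "Acme", 35.0),
            ("P2", "Sinking fly line", "Acme", 55.0),
            ("P3", "Chest waders", "Borealis", 120.0),
        ],
    )
    return conn


def products_by_brand(conn, brand, max_price):
    """Fixed query shape: callers supply values, never SQL text."""
    rows = conn.execute(
        "SELECT ref, name, price FROM product "
        "WHERE brand = ? AND price <= ? ORDER BY price",
        (brand, max_price),  # bound parameters, not string concatenation
    )
    return rows.fetchall()


conn = open_catalogue()
cheap_acme = products_by_brand(conn, "Acme", 40.0)
# A classic injection string is treated as an ordinary, non-matching brand.
injected = products_by_brand(conn, "Acme' OR '1'='1", 1000.0)
```

Because the caller can only pass values into a pre-written statement, the storage behind `products_by_brand` could be swapped for index files without changing the API, which is the trade-off being debated here.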

    If we use an API, we're free to use whatever data format we want - as long as you get your data in good time. So what's the point of putting the data in a SQL database if we can write the API you want using our current index files?

    So what do you want? The ability to find all products matching a particular set of criteria? That's just about what the Perl scripts do now when filtering - though the attribute/variable names use internal codes rather than plaintext. The choices are not coded. So it's not much of an extension to make it easier to create ad-hoc queries for products with plaintext attribute/variable names.

    Filtering is a feature which we always knew would be used in ways we hadn't anticipated. We took a long time designing it, and had many ideas of possible features. In the end we decided on a core set of features that we felt were flexible enough for most purposes. As it is being used more and more we are finding other requirements. It's up to you, the users, to shout loudly for the features you want.
    Hugh Gibson
    CTO - Sellerdeck, part of ClearCourse



      #3
      I'd be very surprised to hear anyone explaining how the numerous .CAT files (huge lumps of Perl containing hashes of product data organised per Section) are the best way to hold data on the server.

      Such a technique cannot scale well (especially for large multi-Product Sections) as the entire lump has to be loaded into server program memory just to access a single product's pricing info.

      It's not 1999 (Actinic V3 era) any more, when Perl was perhaps the only widespread scripting language and there wasn't much in the way of database systems to choose from.
      Norman - www.drillpine.biz
      Edinburgh, U K / Bitez, Turkey



        #4
        I'd be very surprised to hear anyone explaining how the numerous .CAT files (huge lumps of Perl containing hashes of product data organised per Section) are the best way to hold data on the server.
        They're certainly not the best way when filtering, which is why filtering is implemented using two index files - which are much faster to access. The first index file finds products and sort orders that match the filtering criteria. The second index file finds details of those products given the product reference. Hence the two .js (were .json) files in the filtering cache.
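The two-index arrangement described above can be sketched roughly as follows. This is only an illustration in Python with made-up data: the real cache files are .js files whose format is not shown in this thread, and the names `criteria_index`, `sort_orders` and `product_details` are hypothetical.

```python
# Hypothetical stand-in for the first index file:
# (attribute, choice) -> set of matching product references ...
criteria_index = {
    ("colour", "red"): {"P1", "P3"},
    ("colour", "blue"): {"P2"},
    ("size", "large"): {"P1", "P2"},
}
# ... plus pre-computed sort orders over all product references.
sort_orders = {
    "price": ["P3", "P1", "P2"],
    "name": ["P2", "P1", "P3"],
}
# Hypothetical stand-in for the second index file:
# product reference -> details needed to display the product.
product_details = {
    "P1": {"name": "Red large widget", "price": 12.0},
    "P2": {"name": "Blue large widget", "price": 20.0},
    "P3": {"name": "Red small widget", "price": 8.0},
}


def filter_products(criteria, order):
    """Step 1: intersect reference sets; step 2: look up details in order."""
    matching = None
    for key in criteria:
        refs = criteria_index.get(key, set())
        matching = refs if matching is None else matching & refs
    if matching is None:  # no criteria selected: everything matches
        matching = set(product_details)
    return [
        {"ref": ref, **product_details[ref]}
        for ref in sort_orders[order]
        if ref in matching
    ]


red_by_price = filter_products([("colour", "red")], "price")
red_large = filter_products([("colour", "red"), ("size", "large")], "name")
```

Only the small reference sets are touched while filtering; the fuller product details are read just for the products that survive.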

        We're adding more product details in the second index file as we develop filtering.

        The .CAT files have their place, and limitations. They perform pretty well for the functionality in there. We don't want to rewrite something just for the sake of it, as that risks destabilising that area.

        We do rewrite when performance is an issue. That's what led to the filtering cache. Also, when implementing filtering, the format of the word index file was changed, giving a x5 improvement in accessing it on the server.
        Hugh Gibson
        CTO - Sellerdeck, part of ClearCourse



          #5
          So what's the point of putting the data in a SQL database if we can write the API you want using our current index files?
          To me it's very simple. The current index files are a subset of the data that has to be defined in advance. This means they can't cope with changing data (such as 'in stock'), and it gets messy if you start to try to filter on search queries or refine filter results in an advanced way.

          We're adding more product details in the second index file as we develop filtering.
          Which is exactly the point I was making initially about a short term solution getting more complicated as more shortcomings are exposed and more features are added. In the end it all gets so complicated that everyone wishes it had just been done right in the first place.

          I would be interested to know your view on these "compromises".
          The things that come to mind (not having used filtering myself) are:

          1. Not being able to filter on 'in stock' without having to set up a new variable and then somehow keep that consistent with the actual stock level.

          2. Having to manually define the values of the filtered variables. So if my products are all called 'Dunlop Winter Tyres 18/50 VR' etc then it would be great if I could just set up my filter variable along the lines of (Dunlop = Has 'Dunlop' in 'Product Name')

          3. Combining search and filter results.

          On site Search is increasingly the way of finding the right products and we need to be able to filter and re-sort on the search results (and vice versa)

          - To filter a search result to only show results from a certain brand.
          - To refine a filter result to only show ones with 'red'
          - To sort filtered searches by 'popularity', 'price', 'New Products', etc

          4. Large amounts of data being uploaded when lots of results in order to filter/refine/re-sort or whatever on the web page.
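Points 1 and 2 in the list above can be sketched against a live table. This is a minimal Python/sqlite3 illustration with a hypothetical schema and made-up rows, not a description of how Sellerdeck stores stock or product names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the stock level lives in the same table that is queried.
conn.execute("CREATE TABLE product (ref TEXT PRIMARY KEY, name TEXT, stock INTEGER)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [
        ("T1", "Dunlop Winter Tyres 18/50 VR", 4),
        ("T2", "Dunlop Summer Tyres 17/45 VR", 0),
        ("T3", "Other Brand Winter Tyres 18/50 VR", 9),
    ],
)


def in_stock_matching(conn, keyword):
    """Filter value derived from the product name, limited to items in stock."""
    rows = conn.execute(
        "SELECT ref FROM product WHERE name LIKE ? AND stock > 0 ORDER BY ref",
        (f"%{keyword}%",),
    )
    return [ref for (ref,) in rows]


before = in_stock_matching(conn, "Dunlop")
# A stock change is visible to the very next query; no separate variable to sync.
conn.execute("UPDATE product SET stock = 3 WHERE ref = 'T2'")
after = in_stock_matching(conn, "Dunlop")
```

The 'Dunlop' filter value is never entered by hand; it is matched from the product name at query time, and 'in stock' reflects whatever the stock column holds at that moment.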

          As I said before though, my view on this is fairly simple:

          a) Search and filtering are part and parcel of the same thing and need to work well together.

          b) There's a lot of product data contained in the database that can be used for advanced search and filtering options. Trying to re-create all this in proprietary files is going to be slow, and pre-indexing things is going to cause consistency issues and won't work when a new refinement is added by the customer.

          c) standard databases have been developed and optimised for years to give the fastest performance and include all the features that are needed for flexible access to the data. Trying to do this yourself is a waste of development resources, will take longer to implement, will be slower in performance and inevitably won't have a feature that's needed because 'we didn't think it would be important.'

          d) Transferring the data from the local database to an online one is dead simple, and then you have the flexibility to access it as you desire. I have no desire to access it myself (although others might) and an API is fine. That said, the ability to generate sales data and update stocks online would be very helpful for anyone who's selling across multiple channels, and would be much easier for Sellerdeck / Contractor / Third Party to develop if a standard database is used. Again, that is one of the advantages of using a complete standard database rather than proprietary stuff for bits and pieces here and there.

          Mike

          PS. Another example of filtering / searching not well served at the moment would be a 'bought before' feature for customers with accounts. Maybe sorted by 'most recent'.

          I'm sure there are lots of these kinds of features users will want that are easy to access and define once the data is readily available. It's all much easier to do if the data is held there in the backend and all that's needed is a new query to access it and a variable set created for definition and display.
          -----------------------------------------

          First Tackle - Fly Fishing and Game Angling

          -----------------------------------------



            #6
            Having let my thoughts settle on this a bit. Here's what I think my requirements are:

            1. For customers to be able to 'Search' > 'Filter' > 'Sort' seamlessly in varying orders, but particularly in this one, which makes most sense to me. Filters should also work at multiple levels, i.e. Search = 'Widgets', Filter1 = 'Red', Filter2 = 'Large', Filter3 = In Stock, Sort by = Popularity.

            2. For all potentially relevant data to be available at each function (search, filter, sort). Including things like: in Stock, brand, bought before, hidden fields, popularity, most recent, best reviews, section, price, more than n% off RRP, etc.

            3. For the data to be returned, sorted and displayed quickly, regardless of the number of items in the results. To be able to display a fixed number of results per page.

            4. For the system to be flexible enough so requests for new features can be quickly and easily implemented without Sellerdeck having to do a complete redevelopment. For example so that customers with accounts can filter on 'recently bought'.

            5. For the system to be flexible enough that Sellerdeck / Contractor / Third Party can use it to develop new features that are already required or will be in the foreseeable future, such as real-time stock management and price management across multiple channels.

            These all seem perfectly reasonable to me as a requirement. They all also seem to point in the direction of an online backend database.
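Requirements 1 and 3 above (Search > Filter > Sort, with a fixed number of results per page) can be sketched as a single backend query. This is an assumption-laden illustration in Python/sqlite3: the schema, the `SORT_COLUMNS` whitelist and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (ref TEXT PRIMARY KEY, name TEXT, colour TEXT, "
    "size TEXT, stock INTEGER, popularity INTEGER)"
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("W1", "Widget Classic", "Red", "Large", 5, 90),
        ("W2", "Widget Pro", "Red", "Large", 0, 99),  # out of stock
        ("W3", "Widget Mini", "Red", "Small", 7, 80),
        ("W4", "Widget Max", "Red", "Large", 2, 95),
        ("G1", "Gadget", "Red", "Large", 3, 100),  # does not match the search
    ],
)

# Whitelist of sort orders, so no user-supplied text is placed in the SQL.
SORT_COLUMNS = {"popularity": "popularity DESC", "name": "name ASC"}


def search_filter_sort(conn, term, colour, size, sort_by, page, per_page):
    """Search, filter, sort and paginate in one query; return one page of refs."""
    sql = (
        "SELECT ref FROM product "
        "WHERE name LIKE ? AND colour = ? AND size = ? AND stock > 0 "
        f"ORDER BY {SORT_COLUMNS[sort_by]}, ref LIMIT ? OFFSET ?"
    )
    params = (f"%{term}%", colour, size, per_page, (page - 1) * per_page)
    return [ref for (ref,) in conn.execute(sql, params)]


page_one = search_filter_sort(conn, "Widget", "Red", "Large", "popularity", 1, 1)
page_two = search_filter_sort(conn, "Widget", "Red", "Large", "popularity", 2, 1)
```

Only the rows for the requested page leave the database, whatever the size of the full result set.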

            Mike

            PS: I'd also like filter values to be able to 'self match' based on defined criteria such as 'contains keyword in Product Name'. etc. I'm sure others will have similar requirements once they think about it / start using it. Hence the need for item 4.
            -----------------------------------------

            First Tackle - Fly Fishing and Game Angling

            -----------------------------------------



              #7
              Originally posted by Mike Hughes
              not having used filtering myself
              It scares the life out of me the inbuilt filtering so I also stay well clear.

              I set up my own filtering using a lightweight jQuery system http://www.wineschoppen.co.uk/acatal...packaging.html ... which excludes products if not available to buy online. It could easily be made to block products by testing stock levels, but the site does not use stock control.

              Slightly more sexy as well


              Bikster
              SellerDeck Designs and Responsive Themes



                #8
                How many lines of code was yours John?

                My Sorted Products / Sections / Search add-on (V7 through V11) was 200 lines of jQuery-based JavaScript (and it also did filtering, though not as comprehensively as SD 2013 does). However, you could easily understand and tweak it.

                SD's stuff clocks in at approximately 7,000 lines of JavaScript and 5,000 lines of Perl, not counting the massive supporting data structures - what can possibly go wrong?
                Norman - www.drillpine.biz
                Edinburgh, U K / Bitez, Turkey



                  #9
                  A meagre 15kb minified plus the call to jQuery ... less is more or so Google Webmaster keeps telling me!


                  Bikster
                  SellerDeck Designs and Responsive Themes



                    #10
                    @Jont:

                    I set up my own filtering using a lightweight jQuery system
                    There are some major limitations compared with inbuilt filtering. The main one is that it only works with products in the current page. It can't pull in products from elsewhere in your catalogue. Do you have lots of duplicate products to drive Christmas etc?

                    There are no counts against choices either.

                    Slightly more sexy as well
                    No arguments there

                    @Norman:

                    SD's stuff clocks in at approximately 7,000 lines of JavaScript and 5,000 lines of Perl,
                    Some comments about the source code size:
                    • the Perl code was changed for filtering, but not by as much as 5000 lines. Most of the code was already there for searching, including the little-used feature to filter search results.
                    • The Perl was needed as a backup to the filtering in the JavaScript. It's possible to do filtering and sorting with JS disabled (though not all functionality is present).
                    • We chose not to use jQuery to limit external dependencies. That would have reduced the size of our JS.
                    • Getting counts for choices is standard in any ecommerce site, and took a lot of code. We worked hard to make that efficient which may have increased code size.
                    • Getting Search Word relevance information required changes to indexing in the desktop, format changes in the index files, changes to the Perl and so on. These things are complex and can't be implemented in a few lines of code.
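To show why counts against choices take real work, here is a naive sketch in Python. It is not Sellerdeck's implementation (that is in JavaScript and Perl and is not shown in this thread); the data and the `choice_counts` helper are hypothetical. The usual convention is that the counts for one attribute are computed with every other selected filter applied, but not that attribute's own selection.

```python
from collections import Counter

# Hypothetical product data.
products = [
    {"ref": "P1", "brand": "Acme", "colour": "Red"},
    {"ref": "P2", "brand": "Acme", "colour": "Blue"},
    {"ref": "P3", "brand": "Borealis", "colour": "Red"},
]


def choice_counts(products, selected, attribute):
    """Count products per choice of `attribute`, given the other selections."""
    counts = Counter()
    for product in products:
        # Apply every selected filter except the one being counted.
        if all(product[a] == v for a, v in selected.items() if a != attribute):
            counts[product[attribute]] += 1
    return dict(counts)


brand_counts = choice_counts(products, {"colour": "Red"}, "brand")
colour_counts = choice_counts(products, {"colour": "Red"}, "colour")
```

This version rescans every product for every attribute on every change, which is exactly the cost that a more efficient (and longer) implementation would try to avoid.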


                    Yes, the code does have its moments but when you have something working and tested, you avoid tweaking it: it may break things and there are better things to do.

                    not counting the massive supporting data structures
                    The data structures haven't changed much from earlier versions. We actually eliminated one of them - the list of products in sorted price order. That's handled by the Word Index file now. The Word Index file is also used for searching and filtering.

                    what can possibly go wrong?
                    Yes, it's lots of code. We gave it lots of testing prior to release, but there were bugs. We've fixed some of those in the maintenance release, and have more planned for the next maintenance release. It's like any new feature - there are problems that slip through.

                    I suppose my attitude to 10,000 lines of code is different to yours - we have over 2 million LOC in the desktop.

                    @Mike:

                    Thanks for those thoughts. I'll refer them to Bruce, and spend some more time thinking about your comments.

                    We're pretty sure that what we have now enables setting up shops that are as good as the major online shops out there. Some of your ideas may take those shops to a new level, which is where we want to be.

                    One thought: you're pretty enthusiastic about getting the offline database up to the server. However, even if we did that we would still have to do a major amount of work to get the data in a form suitable to drive the website. For example:
                    • The Word Index calculates weighting for words by analysing the products. Therefore if you search for "Green Bag", products with "Green" and "Bag" in the title will appear before products with just "Green" or just "Bag". That sort of thing is difficult to do efficiently in a query using the data as it exists at the moment.
                    • We get all sort orders downloaded to the desktop so sorting and pagination can be done in the desktop. Getting products in one sort order is easy when doing a query, but getting all sort orders at once is not.
                    • Filtering treats Attributes/Choices and Variables/Values as equivalent. They would have to be merged prior to running any query (or joined in the query, making it more complex).
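The "Green Bag" example from the first bullet can be sketched as follows. This is a deliberately simple Python illustration that scores titles at query time; the Word Index described above pre-computes its weighting on the desktop, and its actual algorithm is not given in this thread. The function name and data are hypothetical.

```python
def rank_by_title_matches(query, titles):
    """Order product refs by how many query words appear in the title."""
    words = query.lower().split()
    scored = []
    for ref, title in titles.items():
        title_words = set(title.lower().split())
        score = sum(1 for word in words if word in title_words)
        if score:  # drop products matching none of the words
            scored.append((-score, ref))  # negative so higher scores sort first
    return [ref for _, ref in sorted(scored)]


# Hypothetical titles.
titles = {
    "P1": "Green Hat",
    "P2": "Large Green Bag",
    "P3": "Leather Bag",
    "P4": "Blue Scarf",
}
ranking = rank_by_title_matches("Green Bag", titles)
```

The product with both words in its title comes first, followed by those with only one of them (ties are broken by product reference here).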



                    I'm not saying our approach is "right" from some sort of doctrinal head-in-the-sand point of view. As I said, I have experience of other models. However, in software development we have to get a return on investment. So we implement features that have been requested or we feel will be useful. If something is going to take many months with little gain, we don't consider it. However, if we can see the benefit then we look into it.
                    Hugh Gibson
                    CTO - Sellerdeck, part of ClearCourse



                      #11
                      Originally posted by Hugh Gibson
                      There are some major limitations compared with inbuilt filtering. The main one is that it only works with products in the current page. It can't pull in products from elsewhere in your catalogue. Do you have lots of duplicate products to drive Christmas etc?
                      That page works on a modified ProductList which automatically populates products depending on the variable declarations. As far as the database is concerned no duplicates are being used at all so the overhead is very small.

                      It is really not that much different in operating terms from the SellerDeck version which would require the user to click on card, then blue, then 2 bottle options for example ... it is just the first step returns the whole card range page first and then on page filtering.

                      It does have some limitations I agree... particularly price ranges which I have started work on but shelved until after the site quietens down in the New Year


                      Bikster
                      SellerDeck Designs and Responsive Themes



                        #12
                        @ Hugh,

                        I agree. Search is far more complex these days than just finding all products with the keywords in them, but there are only a few different places the words can exist, and it doesn't seem too difficult to rank the results on a fairly straightforward basis.

                        I think the use of search is becoming increasingly important and more and more the ability to refine the search results to find the exact thing you're after is something that everyone is getting used to. Google's getting everyone used to doing this where you run a search and then filter on 'images' or 'discussion' or 'news' or 'shopping' etc.

                        Like a lot of people I use amazon a lot and the process I use there is nearly always: Search > Select (filter) a department > Sort by popularity / Review / Price > Select (filter) by category or brand to refine the results.

                        That's the kind of thing I want to be able to do, seamlessly and with as little effort as possible.

                        This just seems to me like the kind of thing that's best done directly on the database, as there seem to be too many independent variables to do it efficiently any other way.

                        We get all sort orders downloaded to the desktop so sorting and pagination can be done in the desktop. Getting products in one sort order is easy when doing a query, but getting all sort orders at once is not.
                        I see this as a weakness of the current solution. When there's a large number of results it takes a long time to download, and if the user changes or refines their search and/or filter it all has to be done again. I think these days it's nearly always more involved than a single 'search / filter then sort' which means it could well be that it's quicker and more efficient to run a new search query each time and just return the results that are needed for the current page. More often than not the next page is probably going to be based on different search criteria. I think our main tendency nowadays is to refine our results rather than wade through page after page of the wrong ones.

                        Searching / Filtering to me are part and parcel of the same thing and really do need to work seamlessly.

                        Mike
                        -----------------------------------------

                        First Tackle - Fly Fishing and Game Angling

                        -----------------------------------------



                          #13
                          I see this as a weakness of the current solution. When there's a large number of results it takes a long time to download, and if the user changes or refines their search and/or filter it all has to be done again.
                          For filtering, since the first maintenance release two static files are downloaded when the page is opened and all filtering, re-ordering and pagination is done in the JavaScript. It really is fast. There is only ever one call to the Perl script for a filter page after any upload - to regenerate the filter cache files. The first time someone else visits the page they download the filter cache files (without calling Perl) and then when they navigate away and back again, the filter cache files are loaded from the browser cache.
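The build-once, reuse-afterwards behaviour described above can be sketched like this. It is a hypothetical Python model, not the real mechanism: in the actual system the cache is a pair of static .js files and the reuse happens in the browser cache. The `FilterCache` class and the `build` callable (standing in for the single Perl call) are assumptions made for illustration.

```python
class FilterCache:
    """Build the filter data once, then serve it until invalidated."""

    def __init__(self, build):
        self._build = build  # expensive builder (stand-in for the Perl call)
        self._data = None
        self.build_calls = 0

    def invalidate(self):
        """Stand-in for a catalogue upload, which makes the cache stale."""
        self._data = None

    def get(self):
        if self._data is None:
            self._data = self._build()
            self.build_calls += 1
        return self._data


def page(refs, page_number, per_page):
    """Pagination done locally on an already-sorted list of references."""
    start = (page_number - 1) * per_page
    return refs[start:start + per_page]


# Hypothetical cached data: every sort order is present, so re-sorting and
# paging need no further call to the server.
cache = FilterCache(lambda: {"price": ["P3", "P1", "P2"], "name": ["P2", "P1", "P3"]})
first_by_price = page(cache.get()["price"], 1, 2)
second_by_name = page(cache.get()["name"], 2, 2)
calls_before_upload = cache.build_calls
cache.invalidate()
cache.get()
calls_after_upload = cache.build_calls
```

The cost of this design is the one raised earlier in the thread: the whole data set is transferred up front, which pays off only if the visitor then filters, re-sorts or pages several times.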

                          Try http://www.theoutdoorgearshop.co.uk/...-Arcteryx.html for example. There are two .js files downloaded which contain the filtering information (find AjaxProduct*.js). If you navigate away from the page and back again then those files are loaded from the browser cache. It's very quick. As they are .js files, compressed download also works.

                          Note that this page could be faster if the default view didn't show the products for filtering. It's a mix of the old duplicate product special section and filtering. Concentrate on filtering after page load.

                          Searching has to call the Perl to get the matching products. This can't use caching of results. But in this case the Perl creates the response page, so the download is limited to the number of products you have in the first page. There isn't a big download.

                          I think our main tendency nowadays is to refine our results rather than wade through page after page of the wrong ones.
                          Agreed. That's what we all do on Google, and why SEO strives to get on the first page. At present in SellerDeck, filtering is the way to refine results rather than using an initial search then refine.

                          At least search in v12 uses Relevance sorting. In v11 it just presents results in the order of their Product Reference which is pretty useless when you have 1000 hits to sort through.

                          Searching / Filtering to me are part and parcel of the same thing and really do need to work seamlessly.
                          Yes. I've always envisaged a "search" box in a filtered page, so you can specify a search word as well as filtering using pre-defined values. Our current architecture would handle this easily (though not cached).

                          The reverse - filtering on search results - has always been possible. But as products can come from different sections, it's difficult to define a rational set of filtering attributes/choices.
                          Hugh Gibson
                          CTO - Sellerdeck, part of ClearCourse



                            #14
                            The reverse - filtering on search results - has always been possible. But as products can come from different sections, it's difficult to define a rational set of filtering attributes/choices.
                            Is there an example where anyone has this working? The Outdoorgear shop has no filtering on the search results.

                            If the user process is search and refine, which we seem to agree on, then it strikes me that we mostly use search to find an initial product set (and sometimes a re-search), then filtering to refine it, and sort to see it in the order we want to look at it.

                            Filtering then searching just feels all wrong. Why would I filter things first if I'm then going to do a search?

                            As you say the filtering options will need to be different for each section. I guess this is why Amazon asks people to select a department before sorting. They also pre-select the best match for category but allow you to change it.

                            A good example of the need for different filtering is fishing tackle. If you search for 'fly line' it should show you the product results, and you might then filter on brand, or type (Floating / Intermediate / Sinking / Sink Tip), or colour (White / Green / Hi Viz / Clear / Brown), or AFTM (WF5 / WF6 / WF7 / WF8 / WF9), or use (River / Lake / Sea / Tropical), or species (Trout / Salmon / Pike / etc), whereas a search for 'waders' would probably have filtering options like material (Nylon / Neoprene / Breathable), or foot (Stocking / Boot), or size (8 / 9 / 10 / 11), etc.

                            At least Brand, Price and In Stock would be common.
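One way to get filter options that differ per search is to derive them from whatever the search returned. The sketch below is a hypothetical Python illustration with made-up products; it is not an existing Sellerdeck feature.

```python
def refinement_options(results):
    """Collect the attributes and values actually present in a result set."""
    options = {}
    for product in results:
        for attribute, value in product.items():
            if attribute in ("ref", "name"):  # identifiers, not filters
                continue
            options.setdefault(attribute, set()).add(value)
    return {attribute: sorted(values) for attribute, values in options.items()}


# Hypothetical search results for 'fly line' and for 'waders'.
fly_lines = [
    {"ref": "L1", "name": "Fly line A", "brand": "Acme", "type": "Floating", "aftm": "WF5"},
    {"ref": "L2", "name": "Fly line B", "brand": "Borealis", "type": "Sinking", "aftm": "WF7"},
]
waders = [
    {"ref": "W1", "name": "Waders A", "brand": "Acme", "material": "Neoprene", "foot": "Boot"},
    {"ref": "W2", "name": "Waders B", "brand": "Acme", "material": "Breathable", "foot": "Stocking"},
]

fly_line_options = refinement_options(fly_lines)
wader_options = refinement_options(waders)
```

Brand shows up for both searches because both result sets carry it, while type and AFTM appear only for fly lines and material and foot only for waders.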

                            Can we do this now? Only this is really the way I think it should work.

                            Mike
                            -----------------------------------------

                            First Tackle - Fly Fishing and Game Angling

                            -----------------------------------------



                              #15
                              Having mulled this over a bit, I think I understand what my main problem with filtering is as it's done today; it's not really meant for general use.

                              If I remember correctly, the request for filtering came from people who wanted to do the 'car products' thing. i.e.

                              1. Select Car: Ford
                              2. Select Model: Fiesta
                              3. Select year: 2008-2011
                              4. Select Engine: 1999 cc
                              5. Select Product Type:

                              - Oil Filter
                              - Air Filter
                              - Wiper Blades
                              - etc.

                              The same kind of requests came from printer cartridge sellers, tyre sellers, etc., where you filter down to select the printer you have and then it shows you the printer cartridges, drums, refills, etc. that fit that particular printer.

                              For me, and probably 95-98% of Sellerdeck users, this isn't the way our customers shop / try to find products and it has little to no relevance.

                              The problem I see is that people are building shops using it that shouldn't really and when I look at what they're doing I end up shaking my head thinking "god, this is crap."

                              I'm seeing people building whole sites around 'filtering' that just don't work.

                              For me, good design for a mainstream ecommerce site has to be based around:

                              1. A proper site structure with straightforward navigation for customers who want to browse. They need to know how to get to a particular section, know where they are and know how to get back to someplace they were before (as in, "ooh, that one I saw before was probably the best one, now where was it again?"). Using ad-hoc filtering to move around a site is just the pits for this (and no good for the search engines either).

                              2. A good search facility that allows customers to search for products and then refine and order their search. This is really what the vast majority of us need and what we're missing right now. In this case 'refine' might mean 'filter' for brand, price, size, etc but it's kind of coming at it from a very different angle / requirement.

                              So where does this leave us? No matter how many ways I look at it, it still strikes me that the current implementation doesn't do what most of us need and is the wrong way to go about it. As soon as you bring search into the equation you're no longer creating a simple hierarchy that allows you to dive down specific paths. Products in the results set will come from different sections/areas and each refinement could well require a different set of results. I'm not a systems designer, but it seems pretty clear to me that what we need is a backend system that will return a limited number of results extremely quickly and efficiently, rather than something that returns the whole set and then 'filters' out the ones not needed. We also need a dynamic set of 'refinement' options to be presented, to take into account the different options that are valid for the different product areas.

                              It really does say to me: back end database with SQL.

                              And this is also where I get back to where I started from. Having developed a particular feature for a specific requirement, there's a huge incentive to try and say, "Oh, and this can also be used to do XYZ." You then end up doing more development to try and make it fit something it wasn't intended for in the first place, with the danger of wasting multiple development cycles and still not delivering what's really needed.

                              Please, please. Let's not do this again. Filtering as it stands is taking us down the wrong path.

                              Mike
                              -----------------------------------------

                              First Tackle - Fly Fishing and Game Angling

                              -----------------------------------------
