Which columns should be indexed when all may be used in different search queries?












4















Background



I'm working on a website for a movie theather chain currently located in four different cities (might expand in the future). They use the same, single-database website for all cities, which means I have to have a column in certain tables which holds the ID of the city that each row belongs to.



Right now I have three different tables:





  • Cinemas - Contains each city's cinema (ID and name).


  • Movies - Contains all movies that has been/will be shown at the cinema.


  • Showtimes - Contains all showtimes for all movies in all cities.


The structure of the Showtimes table is the following:



Column Name   | Column Type  | Description
--------------+--------------+---------------
ID | BIGINT | (Primary) Unique ID for each showtime (perhaps unnecessary?)
CinemaID | TINYINT | Foreign key bound to Cinemas.ID
MovieID | BIGINT | Foreign key bound to Movies.ID
Showtime | DATETIME | At what date and time the movie will show

(will contain multiple rows for each movie, i.e. one row for each showtime)


How this table will be used



A user of the website must be able to:





  • View all current/upcoming movies and showtimes (sorted by date) in the selected city.



    Example query (backend):



    SELECT MovieID, Showtime FROM Showtimes WHERE CinemaID = ? ORDER BY Showtime



  • Select a single movie and view all showtimes for that specific title only (in the selected city).



    Example query:



    SELECT Showtime FROM Showtimes WHERE CinemaID = ? AND MovieID = ? ORDER BY Showtime



  • Select a single day and view all movies and showtimes for that day only (in the selected city).



    Example query:



    SELECT MovieID, Showtime FROM Showtimes WHERE CinemaID = ? AND (Showtime BETWEEN [date 12:00 AM] AND [date 12:00 PM])



So naturally I decided that I needed to create indexes for the columns.



Problem



What I'm having trouble with is deciding/determining how to index the columns properly. One index for each column seems quite expensive[1][2] so I started looking into composite indexes, which seems to be the right choice but also led to even more confusion.



From my understanding (based on what I've read) you should add the columns to the index by order of selectivity, making the most selective (I'm guessing that means the most unique/with the most cardinality?) column the first in the composite index[3] (in my case that would be the Showtime column). The only problem with that is that the index can only be used by the database if the first column is included in the search query[4][5], which it currently isn't in either of my queries.



Question



What kind of index(es) should I apply to my columns in order to cover all usage scenarios? (the last scenario may be omitted, but the first two are required)



Should I use a composite index on all columns, for some columns, or do I need a separate index for each column?



This table is updated at most a few times per week to add new showtimes.



Footnotes



1MySQL indexes - what are the best practices?



2Indexing every column in a table



3How important is the order of columns in indexes? (question)



4How important is the order of columns in indexes? (#2 top-voted answer)



5When should I use a composite index?










share|improve this question









New contributor




Visual Vincent is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.





















  • "one row for each showtime" -- What about a cinaplex that has multiple screens with the same movie at the same time?

    – Rick James
    8 hours ago











  • @RickJames : This chain (which is relatively small) currently only have one screen in each theater. But in case they'd get a second, would it be a problem to add another column for that without an index if we don't need to search for it (only retreive it)?

    – Visual Vincent
    8 hours ago






  • 2





    (Upvote for a well-written Question.)

    – Rick James
    7 hours ago
















4















Background



I'm working on a website for a movie theather chain currently located in four different cities (might expand in the future). They use the same, single-database website for all cities, which means I have to have a column in certain tables which holds the ID of the city that each row belongs to.



Right now I have three different tables:





  • Cinemas - Contains each city's cinema (ID and name).


  • Movies - Contains all movies that has been/will be shown at the cinema.


  • Showtimes - Contains all showtimes for all movies in all cities.


The structure of the Showtimes table is the following:



Column Name   | Column Type  | Description
--------------+--------------+---------------
ID | BIGINT | (Primary) Unique ID for each showtime (perhaps unnecessary?)
CinemaID | TINYINT | Foreign key bound to Cinemas.ID
MovieID | BIGINT | Foreign key bound to Movies.ID
Showtime | DATETIME | At what date and time the movie will show

(will contain multiple rows for each movie, i.e. one row for each showtime)


How this table will be used



A user of the website must be able to:





  • View all current/upcoming movies and showtimes (sorted by date) in the selected city.



    Example query (backend):



    SELECT MovieID, Showtime FROM Showtimes WHERE CinemaID = ? ORDER BY Showtime



  • Select a single movie and view all showtimes for that specific title only (in the selected city).



    Example query:



    SELECT Showtime FROM Showtimes WHERE CinemaID = ? AND MovieID = ? ORDER BY Showtime



  • Select a single day and view all movies and showtimes for that day only (in the selected city).



    Example query:



    SELECT MovieID, Showtime FROM Showtimes WHERE CinemaID = ? AND (Showtime BETWEEN [date 12:00 AM] AND [date 12:00 PM])



So naturally I decided that I needed to create indexes for the columns.



Problem



What I'm having trouble with is deciding/determining how to index the columns properly. One index for each column seems quite expensive[1][2] so I started looking into composite indexes, which seems to be the right choice but also led to even more confusion.



From my understanding (based on what I've read) you should add the columns to the index by order of selectivity, making the most selective (I'm guessing that means the most unique/with the most cardinality?) column the first in the composite index[3] (in my case that would be the Showtime column). The only problem with that is that the index can only be used by the database if the first column is included in the search query[4][5], which it currently isn't in either of my queries.



Question



What kind of index(es) should I apply to my columns in order to cover all usage scenarios? (the last scenario may be omitted, but the first two are required)



Should I use a composite index on all columns, for some columns, or do I need a separate index for each column?



This table is updated at most a few times per week to add new showtimes.



Footnotes



1MySQL indexes - what are the best practices?



2Indexing every column in a table



3How important is the order of columns in indexes? (question)



4How important is the order of columns in indexes? (#2 top-voted answer)



5When should I use a composite index?










share|improve this question









New contributor




Visual Vincent is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.





















  • "one row for each showtime" -- What about a cinaplex that has multiple screens with the same movie at the same time?

    – Rick James
    8 hours ago











  • @RickJames : This chain (which is relatively small) currently only have one screen in each theater. But in case they'd get a second, would it be a problem to add another column for that without an index if we don't need to search for it (only retreive it)?

    – Visual Vincent
    8 hours ago






  • 2





    (Upvote for a well-written Question.)

    – Rick James
    7 hours ago














4












4








4








Background



I'm working on a website for a movie theather chain currently located in four different cities (might expand in the future). They use the same, single-database website for all cities, which means I have to have a column in certain tables which holds the ID of the city that each row belongs to.



Right now I have three different tables:





  • Cinemas - Contains each city's cinema (ID and name).


  • Movies - Contains all movies that has been/will be shown at the cinema.


  • Showtimes - Contains all showtimes for all movies in all cities.


The structure of the Showtimes table is the following:



Column Name   | Column Type  | Description
--------------+--------------+---------------
ID | BIGINT | (Primary) Unique ID for each showtime (perhaps unnecessary?)
CinemaID | TINYINT | Foreign key bound to Cinemas.ID
MovieID | BIGINT | Foreign key bound to Movies.ID
Showtime | DATETIME | At what date and time the movie will show

(will contain multiple rows for each movie, i.e. one row for each showtime)


How this table will be used



A user of the website must be able to:





  • View all current/upcoming movies and showtimes (sorted by date) in the selected city.



    Example query (backend):



    SELECT MovieID, Showtime FROM Showtimes WHERE CinemaID = ? ORDER BY Showtime



  • Select a single movie and view all showtimes for that specific title only (in the selected city).



    Example query:



    SELECT Showtime FROM Showtimes WHERE CinemaID = ? AND MovieID = ? ORDER BY Showtime



  • Select a single day and view all movies and showtimes for that day only (in the selected city).



    Example query:



    SELECT MovieID, Showtime FROM Showtimes WHERE CinemaID = ? AND (Showtime BETWEEN [date 12:00 AM] AND [date 12:00 PM])



So naturally I decided that I needed to create indexes for the columns.



Problem



What I'm having trouble with is deciding/determining how to index the columns properly. One index for each column seems quite expensive[1][2] so I started looking into composite indexes, which seems to be the right choice but also led to even more confusion.



From my understanding (based on what I've read) you should add the columns to the index by order of selectivity, making the most selective (I'm guessing that means the most unique/with the most cardinality?) column the first in the composite index[3] (in my case that would be the Showtime column). The only problem with that is that the index can only be used by the database if the first column is included in the search query[4][5], which it currently isn't in either of my queries.



Question



What kind of index(es) should I apply to my columns in order to cover all usage scenarios? (the last scenario may be omitted, but the first two are required)



Should I use a composite index on all columns, for some columns, or do I need a separate index for each column?



This table is updated at most a few times per week to add new showtimes.



Footnotes



1MySQL indexes - what are the best practices?



2Indexing every column in a table



3How important is the order of columns in indexes? (question)



4How important is the order of columns in indexes? (#2 top-voted answer)



5When should I use a composite index?










share|improve this question









New contributor




Visual Vincent is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.












Background



I'm working on a website for a movie theather chain currently located in four different cities (might expand in the future). They use the same, single-database website for all cities, which means I have to have a column in certain tables which holds the ID of the city that each row belongs to.



Right now I have three different tables:





  • Cinemas - Contains each city's cinema (ID and name).


  • Movies - Contains all movies that has been/will be shown at the cinema.


  • Showtimes - Contains all showtimes for all movies in all cities.


The structure of the Showtimes table is the following:



Column Name   | Column Type  | Description
--------------+--------------+---------------
ID | BIGINT | (Primary) Unique ID for each showtime (perhaps unnecessary?)
CinemaID | TINYINT | Foreign key bound to Cinemas.ID
MovieID | BIGINT | Foreign key bound to Movies.ID
Showtime | DATETIME | At what date and time the movie will show

(will contain multiple rows for each movie, i.e. one row for each showtime)


How this table will be used



A user of the website must be able to:





  • View all current/upcoming movies and showtimes (sorted by date) in the selected city.



    Example query (backend):



    SELECT MovieID, Showtime FROM Showtimes WHERE CinemaID = ? ORDER BY Showtime



  • Select a single movie and view all showtimes for that specific title only (in the selected city).



    Example query:



    SELECT Showtime FROM Showtimes WHERE CinemaID = ? AND MovieID = ? ORDER BY Showtime



  • Select a single day and view all movies and showtimes for that day only (in the selected city).



    Example query:



    SELECT MovieID, Showtime FROM Showtimes WHERE CinemaID = ? AND (Showtime BETWEEN [date 12:00 AM] AND [date 12:00 PM])



So naturally I decided that I needed to create indexes for the columns.



Problem



What I'm having trouble with is deciding/determining how to index the columns properly. One index for each column seems quite expensive[1][2] so I started looking into composite indexes, which seems to be the right choice but also led to even more confusion.



From my understanding (based on what I've read) you should add the columns to the index by order of selectivity, making the most selective (I'm guessing that means the most unique/with the most cardinality?) column the first in the composite index[3] (in my case that would be the Showtime column). The only problem with that is that the index can only be used by the database if the first column is included in the search query[4][5], which it currently isn't in either of my queries.



Question



What kind of index(es) should I apply to my columns in order to cover all usage scenarios? (the last scenario may be omitted, but the first two are required)



Should I use a composite index on all columns, for some columns, or do I need a separate index for each column?



This table is updated at most a few times per week to add new showtimes.



Footnotes



1MySQL indexes - what are the best practices?



2Indexing every column in a table



3How important is the order of columns in indexes? (question)



4How important is the order of columns in indexes? (#2 top-voted answer)



5When should I use a composite index?







mysql database-design index innodb mariadb






share|improve this question









New contributor




Visual Vincent is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.











share|improve this question









New contributor




Visual Vincent is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.









share|improve this question




share|improve this question








edited 7 hours ago







Visual Vincent













New contributor




Visual Vincent is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.









asked 8 hours ago









Visual VincentVisual Vincent

1236




1236




New contributor




Visual Vincent is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.





New contributor





Visual Vincent is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.






Visual Vincent is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.













  • "one row for each showtime" -- What about a cinaplex that has multiple screens with the same movie at the same time?

    – Rick James
    8 hours ago











  • @RickJames : This chain (which is relatively small) currently only have one screen in each theater. But in case they'd get a second, would it be a problem to add another column for that without an index if we don't need to search for it (only retreive it)?

    – Visual Vincent
    8 hours ago






  • 2





    (Upvote for a well-written Question.)

    – Rick James
    7 hours ago



















  • "one row for each showtime" -- What about a cinaplex that has multiple screens with the same movie at the same time?

    – Rick James
    8 hours ago











  • @RickJames : This chain (which is relatively small) currently only have one screen in each theater. But in case they'd get a second, would it be a problem to add another column for that without an index if we don't need to search for it (only retreive it)?

    – Visual Vincent
    8 hours ago






  • 2





    (Upvote for a well-written Question.)

    – Rick James
    7 hours ago

















"one row for each showtime" -- What about a cinaplex that has multiple screens with the same movie at the same time?

– Rick James
8 hours ago





"one row for each showtime" -- What about a cinaplex that has multiple screens with the same movie at the same time?

– Rick James
8 hours ago













@RickJames : This chain (which is relatively small) currently only have one screen in each theater. But in case they'd get a second, would it be a problem to add another column for that without an index if we don't need to search for it (only retreive it)?

– Visual Vincent
8 hours ago





@RickJames : This chain (which is relatively small) currently only have one screen in each theater. But in case they'd get a second, would it be a problem to add another column for that without an index if we don't need to search for it (only retreive it)?

– Visual Vincent
8 hours ago




2




2





(Upvote for a well-written Question.)

– Rick James
7 hours ago





(Upvote for a well-written Question.)

– Rick James
7 hours ago










3 Answers
3






active

oldest

votes


















4














Composite Primary Key



I would define the primary key as a composite key of (CinemaID, MovieID, Showtime).



These 3 columns uniquely identify each row, and so having a separate ID column is not necessary.



Composite (Secondary) Index



With this PK, the only additional index you will need for your queries is (CinemaID, Showtime).



Why these indexes?



A good way to think about indexes are used is to think of them as the order of columns in a spreadsheet.



Imagine a spreadsheed with (CinemaID, MovieID, Showtime) that is sorted by each column successively.



All your queries have CinemaID present, which means you can quickly find the "section" of the spreadsheet for that CinemaID. Then, for your query that searches by MovieID, you can easily find the "subsection" in the 2nd column, where MovieID matches the searched for value.



As the 3rd column of Showtime is also sorted, you can imagine how quickly and easily it is to find all those show times for that movie, in that cinema. The DBMS does things in a similar way and can retrieve those results extremely quickly.



As for your other queries, they all start with CinemaID and then use Showtime in some manner. They also need the MovieID in their results.



So, the (CinemaID, Showtime) index has you covered there. Again, the CinemaID easily finds the "section" of the spreadsheet (in the analogy), and all possible showtimes (and there are going to be duplicates, assuming there is more than one screen) will be listed in order and are easily searched and/or sorted by those values.



Even better, since your primary key includes MovieID, that column is included for all secondary indexes after the defined columns (at least for MySQL InnoDB - others engines too, but not necessarily all.)



Think of that is a 3rd column in the "spreadsheet" of our secondary index. The reason the column exists is to have all parts of the primary key available to do a lookup to the main table (aka the clustered index, in InnoDB), if necessary. In this simple case, no lookup is needed, so it is even more efficient as it doesn't require that double lookup.



Using just this primary key and a single secondary index, you should obtain excellent performance on any of the queries you listed.



Afterthought



My assumption that this uniquely identifies each row may NOT be correct if you have a movie showing on multiple screens at the same time. If you want to be able to separately identify those screens, then my solution is not best (I can provide another solution for that situation, please just let me know.)






share|improve this answer


























  • Thanks for your answer and the thorough explanation! Currently the theather chain has only one screen in each theater. But would it, in case they got a second, be a problem to add another column for that without an index, if we don't need to search for/sort by the screen ID (only retreive it)?

    – Visual Vincent
    8 hours ago








  • 1





    Adding a screenID won't be an issue UNLESS you need to have the situation where, at a given cinema, there is the same movie playing on both screens, AND starting at the same time. In that case, the primary key of (CinemaID, MovieID, Showtime) does not permit you to enter both ScreenIDs, because only one record for each combination will exist. However, that said, and given that you say you don't need to sort by it, you CAN add ScreenID as the 4th column of the primary key. In that case, you can store both identical showings, and still differentiate by screen. So, either way you are fine.

    – Willem Renzema
    7 hours ago



















4














WHERE CinemaID = ? ORDER BY Showtime  -- and
WHERE CinemaID = ? AND (Showtime BETWEEN [date 12:00 AM] AND [date 12:00 PM]) -- need:
INDEX(CinemaID, Showtime)

WHERE CinemaID = ? AND MovieID = ? ORDER BY Showtime -- needs:
INDEX(CinemaID, MovieId, Showtime) -- or
INDEX(MovieId, CinemaID, Showtime)


Assuming that the the triple (MovieId, CinemaID, Showtime) is unique, I suggest getting rid of id and having



PRIMARY KEY(CinemaID, MovieId, Showtime)
INDEX(CinemaID, Showtime)


Are there cases where the main part of the WHERE is MovieID=...?



Cinemas - A list of each city and its cinema (ID and name):



SELECT Cinema, CinemaID FROM Cimemas;  -- (no index needed)


Movies - A list of movies that has been/will be shown at the cinema.



SELECT DISTINCT MovieID FROM ShowTimes WHERE CinemaID=...
INDEX(CinemaID, MovieID) -- already handled by my proposed PK


Showtimes - A list of all showtimes for all movies in all cities. -- This is a huge output; rethink the requirement for it. That is, think about what the Client will do with it.



Most of these indexes could be deduced from studying indexing cookbook and composite indexing



Note how I did not suggest indexes (single-column vs composite, plus specifics) until I saw the SELECTs.



"columns to the index by order of selectivity, making the most selective (I'm guessing that means the most unique/with the most cardinality?)" -- No. Selectivity is not the key for designing a composite index. Start with all the = columns, in any order. (My Cookbook dwells on this topic.)



"index can only be used by the table if the first column is included in the search query" -- Mostly true. Notice how I recommended 2 indexes (remember: the PK is an index). There are cases where an index can be used for GROUP BY or ORDER BY, ignoring the WHERE; but those are rare.



"first in the composite index ... Showtime" -- It is usually counterproductive to put a DATETIME first in a composite index. In particular, your 3rd query can use both columns of INDEX(CinemaID, Showtime), but not both columns of INDEX(Showtime, CinemaID). It is easy to see this: Think of writing out two lists of cinemas and showtimes. Have one list sorted first on cinema (a la INDEX(cinema, time); have the other sorted on time. Think about which one would have all the rows for a particular cinema over a time range clumped ("clustered") together.



If there is also a ScreenID due to two screens showing the same movie at the same time, tack it onto the PK. However, all the indexes will need rethinking.



(Sorry, Willem, I wrote my Answer before reading yours -- We say almost the same stuff.)






share|improve this answer


























  • Sorry for the confusion. The first three bullet points are just a description of what tables I have and what they contain (i.e. "A list of all showtimes for all movies in all cities" means the table contains all the showtimes across the entire website, but they're not output all at once - they are what's queried by the next three bullet points located under "On the site the user must be able to").

    – Visual Vincent
    7 hours ago













  • Edited question for (hopefully) better clarity. Now, I have a couple questions regarding your answer: 1) "Are there cases where the main part of the WHERE is MovieID=...?" - Do you mean when it's only checking MovieID or when it's checking MovieID first? (if the former: Not at the moment || if the latter: Also not at the moment, how exactly does it make things different?)

    – Visual Vincent
    7 hours ago






  • 1





    @VisualVincent - The order of columns in an index is important. If, for example, ScreenID is stuck in the middle of the index, but not used in the WHERE, the rest of that index is useless for that WHERE.

    – Rick James
    7 hours ago






  • 1





    @VisualVincent - A messy query: "Show all Tuesday matinee showings of G-rated movies." The likely technique is to index some of the stuff, then work harder to filter the rest of criteria.

    – Rick James
    7 hours ago






  • 1





    @VisualVincent - Yes, ScreenID last.

    – Rick James
    7 hours ago



















0














First I would definitely not define the primary key as a composite key of (CinemaID, MovieID, Showtime). Think of what that means: if you want to link an entry you always need those data instead of just an id.



When you create a database I would recommend you to stick to the rules of normalization. They were made because of a reason.
I understand that you want to create it as fast and memory-saving as possible, but we have no idea yet were the bottle-neck will be.
Just create the scheme like you would do it perfectly normalized. Then fill it with test data (a few million entries) and see if the performance seriously is affected.
When you do that you also have some numbers like "this query takes 2 times more memory than the others", and then you can have a look at this.



Currently this smells of premature optimization .



Performance in software is like taste in food: you can guess if it is good, but in the end you have to try it anyway.






share|improve this answer








New contributor




Emarci is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.




















    Your Answer








    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "182"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });






    Visual Vincent is a new contributor. Be nice, and check out our Code of Conduct.










    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f229064%2fwhich-columns-should-be-indexed-when-all-may-be-used-in-different-search-queries%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    3 Answers
    3






    active

    oldest

    votes








    3 Answers
    3






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    4














    Composite Primary Key



    I would define the primary key as a composite key of (CinemaID, MovieID, Showtime).



    These 3 columns uniquely identify each row, and so having a separate ID column is not necessary.



    Composite (Secondary) Index



    With this PK, the only additional index you will need for your queries is (CinemaID, Showtime).



    Why these indexes?



    A good way to think about indexes are used is to think of them as the order of columns in a spreadsheet.



    Imagine a spreadsheed with (CinemaID, MovieID, Showtime) that is sorted by each column successively.



    All your queries have CinemaID present, which means you can quickly find the "section" of the spreadsheet for that CinemaID. Then, for your query that searches by MovieID, you can easily find the "subsection" in the 2nd column, where MovieID matches the searched for value.



    As the 3rd column of Showtime is also sorted, you can imagine how quickly and easily it is to find all those show times for that movie, in that cinema. The DBMS does things in a similar way and can retrieve those results extremely quickly.



    As for your other queries, they all start with CinemaID and then use Showtime in some manner. They also need the MovieID in their results.



    So, the (CinemaID, Showtime) index has you covered there. Again, the CinemaID easily finds the "section" of the spreadsheet (in the analogy), and all possible showtimes (and there are going to be duplicates, assuming there is more than one screen) will be listed in order and are easily searched and/or sorted by those values.



    Even better, since your primary key includes MovieID, that column is included for all secondary indexes after the defined columns (at least for MySQL InnoDB - others engines too, but not necessarily all.)



    Think of that is a 3rd column in the "spreadsheet" of our secondary index. The reason the column exists is to have all parts of the primary key available to do a lookup to the main table (aka the clustered index, in InnoDB), if necessary. In this simple case, no lookup is needed, so it is even more efficient as it doesn't require that double lookup.



    Using just this primary key and a single secondary index, you should obtain excellent performance on any of the queries you listed.



    Afterthought



    My assumption that this uniquely identifies each row may NOT be correct if you have a movie showing on multiple screens at the same time. If you want to be able to separately identify those screens, then my solution is not best (I can provide another solution for that situation, please just let me know.)






    share|improve this answer


























    • Thanks for your answer and the thorough explanation! Currently the theather chain has only one screen in each theater. But would it, in case they got a second, be a problem to add another column for that without an index, if we don't need to search for/sort by the screen ID (only retreive it)?

      – Visual Vincent
      8 hours ago








    • 1





      Adding a screenID won't be an issue UNLESS you need to have the situation where, at a given cinema, there is the same movie playing on both screens, AND starting at the same time. In that case, the primary key of (CinemaID, MovieID, Showtime) does not permit you to enter both ScreenIDs, because only one record for each combination will exist. However, that said, and given that you say you don't need to sort by it, you CAN add ScreenID as the 4th column of the primary key. In that case, you can store both identical showings, and still differentiate by screen. So, either way you are fine.

      – Willem Renzema
      7 hours ago
















    4














    Composite Primary Key



    I would define the primary key as a composite key of (CinemaID, MovieID, Showtime).



    These 3 columns uniquely identify each row, and so having a separate ID column is not necessary.



    Composite (Secondary) Index



    With this PK, the only additional index you will need for your queries is (CinemaID, Showtime).



    Why these indexes?



    A good way to think about indexes are used is to think of them as the order of columns in a spreadsheet.



    Imagine a spreadsheed with (CinemaID, MovieID, Showtime) that is sorted by each column successively.



    All your queries have CinemaID present, which means you can quickly find the "section" of the spreadsheet for that CinemaID. Then, for your query that searches by MovieID, you can easily find the "subsection" in the 2nd column, where MovieID matches the searched for value.



    As the 3rd column of Showtime is also sorted, you can imagine how quickly and easily it is to find all those show times for that movie, in that cinema. The DBMS does things in a similar way and can retrieve those results extremely quickly.



    As for your other queries, they all start with CinemaID and then use Showtime in some manner. They also need the MovieID in their results.



    So, the (CinemaID, Showtime) index has you covered there. Again, the CinemaID easily finds the "section" of the spreadsheet (in the analogy), and all possible showtimes (and there are going to be duplicates, assuming there is more than one screen) will be listed in order and are easily searched and/or sorted by those values.



    Even better, since your primary key includes MovieID, that column is included for all secondary indexes after the defined columns (at least for MySQL InnoDB - others engines too, but not necessarily all.)



    Think of that is a 3rd column in the "spreadsheet" of our secondary index. The reason the column exists is to have all parts of the primary key available to do a lookup to the main table (aka the clustered index, in InnoDB), if necessary. In this simple case, no lookup is needed, so it is even more efficient as it doesn't require that double lookup.



    Using just this primary key and a single secondary index, you should obtain excellent performance on any of the queries you listed.



    Afterthought



    My assumption that this uniquely identifies each row may NOT be correct if you have a movie showing on multiple screens at the same time. If you want to be able to separately identify those screens, then my solution is not best (I can provide another solution for that situation, please just let me know.)






    share|improve this answer


























    • Thanks for your answer and the thorough explanation! Currently the theather chain has only one screen in each theater. But would it, in case they got a second, be a problem to add another column for that without an index, if we don't need to search for/sort by the screen ID (only retreive it)?

      – Visual Vincent
      8 hours ago








    • 1





      Adding a screenID won't be an issue UNLESS you need to have the situation where, at a given cinema, there is the same movie playing on both screens, AND starting at the same time. In that case, the primary key of (CinemaID, MovieID, Showtime) does not permit you to enter both ScreenIDs, because only one record for each combination will exist. However, that said, and given that you say you don't need to sort by it, you CAN add ScreenID as the 4th column of the primary key. In that case, you can store both identical showings, and still differentiate by screen. So, either way you are fine.

      – Willem Renzema
      7 hours ago














    4












    4








    4







    Composite Primary Key



    I would define the primary key as a composite key of (CinemaID, MovieID, Showtime).



    These 3 columns uniquely identify each row, and so having a separate ID column is not necessary.



    Composite (Secondary) Index



    With this PK, the only additional index you will need for your queries is (CinemaID, Showtime).



    Why these indexes?



    A good way to think about indexes are used is to think of them as the order of columns in a spreadsheet.



    Imagine a spreadsheed with (CinemaID, MovieID, Showtime) that is sorted by each column successively.



    All your queries have CinemaID present, which means you can quickly find the "section" of the spreadsheet for that CinemaID. Then, for your query that searches by MovieID, you can easily find the "subsection" in the 2nd column, where MovieID matches the searched for value.



    As the 3rd column of Showtime is also sorted, you can imagine how quickly and easily it is to find all those show times for that movie, in that cinema. The DBMS does things in a similar way and can retrieve those results extremely quickly.



    As for your other queries, they all start with CinemaID and then use Showtime in some manner. They also need the MovieID in their results.



    So, the (CinemaID, Showtime) index has you covered there. Again, the CinemaID easily finds the "section" of the spreadsheet (in the analogy), and all possible showtimes (and there are going to be duplicates, assuming there is more than one screen) will be listed in order and are easily searched and/or sorted by those values.



    Even better, since your primary key includes MovieID, that column is included for all secondary indexes after the defined columns (at least for MySQL InnoDB - others engines too, but not necessarily all.)



    Think of that is a 3rd column in the "spreadsheet" of our secondary index. The reason the column exists is to have all parts of the primary key available to do a lookup to the main table (aka the clustered index, in InnoDB), if necessary. In this simple case, no lookup is needed, so it is even more efficient as it doesn't require that double lookup.



    Using just this primary key and a single secondary index, you should obtain excellent performance on any of the queries you listed.



    Afterthought



    My assumption that this uniquely identifies each row may NOT be correct if you have a movie showing on multiple screens at the same time. If you want to be able to separately identify those screens, then my solution is not best (I can provide another solution for that situation, please just let me know.)






    share|improve this answer















    Composite Primary Key



    I would define the primary key as a composite key of (CinemaID, MovieID, Showtime).



    These 3 columns uniquely identify each row, and so having a separate ID column is not necessary.



    Composite (Secondary) Index



    With this PK, the only additional index you will need for your queries is (CinemaID, Showtime).



    Why these indexes?



    A good way to think about indexes are used is to think of them as the order of columns in a spreadsheet.



    Imagine a spreadsheed with (CinemaID, MovieID, Showtime) that is sorted by each column successively.



    All your queries have CinemaID present, which means you can quickly find the "section" of the spreadsheet for that CinemaID. Then, for your query that searches by MovieID, you can easily find the "subsection" in the 2nd column, where MovieID matches the searched for value.



    As the 3rd column of Showtime is also sorted, you can imagine how quickly and easily it is to find all those show times for that movie, in that cinema. The DBMS does things in a similar way and can retrieve those results extremely quickly.



    As for your other queries, they all start with CinemaID and then use Showtime in some manner. They also need the MovieID in their results.



    So, the (CinemaID, Showtime) index has you covered there. Again, the CinemaID easily finds the "section" of the spreadsheet (in the analogy), and all possible showtimes (and there are going to be duplicates, assuming there is more than one screen) will be listed in order and are easily searched and/or sorted by those values.



    Even better, since your primary key includes MovieID, that column is included for all secondary indexes after the defined columns (at least for MySQL InnoDB - others engines too, but not necessarily all.)



    Think of that is a 3rd column in the "spreadsheet" of our secondary index. The reason the column exists is to have all parts of the primary key available to do a lookup to the main table (aka the clustered index, in InnoDB), if necessary. In this simple case, no lookup is needed, so it is even more efficient as it doesn't require that double lookup.



    Using just this primary key and a single secondary index, you should obtain excellent performance on any of the queries you listed.



    Afterthought



    My assumption that this uniquely identifies each row may NOT be correct if you have a movie showing on multiple screens at the same time. If you want to be able to separately identify those screens, then my solution is not best (I can provide another solution for that situation, please just let me know.)







    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited 8 hours ago

























    answered 8 hours ago









    Willem RenzemaWillem Renzema

    1,173168




    1,173168













    • Thanks for your answer and the thorough explanation! Currently the theather chain has only one screen in each theater. But would it, in case they got a second, be a problem to add another column for that without an index, if we don't need to search for/sort by the screen ID (only retreive it)?

      – Visual Vincent
      8 hours ago








    • 1





      Adding a screenID won't be an issue UNLESS you need to have the situation where, at a given cinema, there is the same movie playing on both screens, AND starting at the same time. In that case, the primary key of (CinemaID, MovieID, Showtime) does not permit you to enter both ScreenIDs, because only one record for each combination will exist. However, that said, and given that you say you don't need to sort by it, you CAN add ScreenID as the 4th column of the primary key. In that case, you can store both identical showings, and still differentiate by screen. So, either way you are fine.

      – Willem Renzema
      7 hours ago



















    • Thanks for your answer and the thorough explanation! Currently the theather chain has only one screen in each theater. But would it, in case they got a second, be a problem to add another column for that without an index, if we don't need to search for/sort by the screen ID (only retreive it)?

      – Visual Vincent
      8 hours ago








    • 1





      Adding a screenID won't be an issue UNLESS you need to have the situation where, at a given cinema, there is the same movie playing on both screens, AND starting at the same time. In that case, the primary key of (CinemaID, MovieID, Showtime) does not permit you to enter both ScreenIDs, because only one record for each combination will exist. However, that said, and given that you say you don't need to sort by it, you CAN add ScreenID as the 4th column of the primary key. In that case, you can store both identical showings, and still differentiate by screen. So, either way you are fine.

      – Willem Renzema
      7 hours ago

















    Thanks for your answer and the thorough explanation! Currently the theather chain has only one screen in each theater. But would it, in case they got a second, be a problem to add another column for that without an index, if we don't need to search for/sort by the screen ID (only retreive it)?

    – Visual Vincent
    8 hours ago







    Thanks for your answer and the thorough explanation! Currently the theather chain has only one screen in each theater. But would it, in case they got a second, be a problem to add another column for that without an index, if we don't need to search for/sort by the screen ID (only retreive it)?

    – Visual Vincent
    8 hours ago






    1




    1





    Adding a screenID won't be an issue UNLESS you need to have the situation where, at a given cinema, there is the same movie playing on both screens, AND starting at the same time. In that case, the primary key of (CinemaID, MovieID, Showtime) does not permit you to enter both ScreenIDs, because only one record for each combination will exist. However, that said, and given that you say you don't need to sort by it, you CAN add ScreenID as the 4th column of the primary key. In that case, you can store both identical showings, and still differentiate by screen. So, either way you are fine.

    – Willem Renzema
    7 hours ago





    Adding a screenID won't be an issue UNLESS you need to have the situation where, at a given cinema, there is the same movie playing on both screens, AND starting at the same time. In that case, the primary key of (CinemaID, MovieID, Showtime) does not permit you to enter both ScreenIDs, because only one record for each combination will exist. However, that said, and given that you say you don't need to sort by it, you CAN add ScreenID as the 4th column of the primary key. In that case, you can store both identical showings, and still differentiate by screen. So, either way you are fine.

    – Willem Renzema
    7 hours ago













    4














    WHERE CinemaID = ? ORDER BY Showtime  -- and
    WHERE CinemaID = ? AND (Showtime BETWEEN [date 12:00 AM] AND [date 12:00 PM]) -- need:
    INDEX(CinemaID, Showtime)

    WHERE CinemaID = ? AND MovieID = ? ORDER BY Showtime -- needs:
    INDEX(CinemaID, MovieId, Showtime) -- or
    INDEX(MovieId, CinemaID, Showtime)


    Assuming that the the triple (MovieId, CinemaID, Showtime) is unique, I suggest getting rid of id and having



    PRIMARY KEY(CinemaID, MovieId, Showtime)
    INDEX(CinemaID, Showtime)


    Are there cases where the main part of the WHERE is MovieID=...?



    Cinemas - A list of each city and its cinema (ID and name):



    SELECT Cinema, CinemaID FROM Cimemas;  -- (no index needed)


    Movies - A list of movies that has been/will be shown at the cinema.



    SELECT DISTINCT MovieID FROM ShowTimes WHERE CinemaID=...
    INDEX(CinemaID, MovieID) -- already handled by my proposed PK


    Showtimes - A list of all showtimes for all movies in all cities. -- This is a huge output; rethink the requirement for it. That is, think about what the Client will do with it.



    Most of these indexes could be deduced from studying indexing cookbook and composite indexing



    Note how I did not suggest indexes (single-column vs composite, plus specifics) until I saw the SELECTs.



    "columns to the index by order of selectivity, making the most selective (I'm guessing that means the most unique/with the most cardinality?)" -- No. Selectivity is not the key for designing a composite index. Start with all the = columns, in any order. (My Cookbook dwells on this topic.)



    "index can only be used by the table if the first column is included in the search query" -- Mostly true. Notice how I recommended 2 indexes (remember: the PK is an index). There are cases where an index can be used for GROUP BY or ORDER BY, ignoring the WHERE; but those are rare.



    "first in the composite index ... Showtime" -- It is usually counterproductive to put a DATETIME first in a composite index. In particular, your 3rd query can use both columns of INDEX(CinemaID, Showtime), but not both columns of INDEX(Showtime, CinemaID). It is easy to see this: Think of writing out two lists of cinemas and showtimes. Have one list sorted first on cinema (a la INDEX(cinema, time); have the other sorted on time. Think about which one would have all the rows for a particular cinema over a time range clumped ("clustered") together.



    If there is also a ScreenID due to two screens showing the same movie at the same time, tack it onto the PK. However, all the indexes will need rethinking.



    (Sorry, Willem, I wrote my Answer before reading yours -- We say almost the same stuff.)






    share|improve this answer


























    • Sorry for the confusion. The first three bullet points are just a description of what tables I have and what they contain (i.e. "A list of all showtimes for all movies in all cities" means the table contains all the showtimes across the entire website, but they're not output all at once - they are what's queried by the next three bullet points located under "On the site the user must be able to").

      – Visual Vincent
      7 hours ago













    • Edited question for (hopefully) better clarity. Now, I have a couple questions regarding your answer: 1) "Are there cases where the main part of the WHERE is MovieID=...?" - Do you mean when it's only checking MovieID or when it's checking MovieID first? (if the former: Not at the moment || if the latter: Also not at the moment, how exactly does it make things different?)

      – Visual Vincent
      7 hours ago






    • 1





      @VisualVincent - The order of columns in an index is important. If, for example, ScreenID is stuck in the middle of the index, but not used in the WHERE, the rest of that index is useless for that WHERE.

      – Rick James
      7 hours ago






    • 1





      @VisualVincent - A messy query: "Show all Tuesday matinee showings of G-rated movies." The likely technique is to index some of the stuff, then work harder to filter the rest of criteria.

      – Rick James
      7 hours ago






    • 1





      @VisualVincent - Yes, ScreenID last.

      – Rick James
      7 hours ago
















    4














    WHERE CinemaID = ? ORDER BY Showtime  -- and
    WHERE CinemaID = ? AND (Showtime BETWEEN [date 12:00 AM] AND [date 12:00 PM]) -- need:
    INDEX(CinemaID, Showtime)

    WHERE CinemaID = ? AND MovieID = ? ORDER BY Showtime -- needs:
    INDEX(CinemaID, MovieId, Showtime) -- or
    INDEX(MovieId, CinemaID, Showtime)


    Assuming that the the triple (MovieId, CinemaID, Showtime) is unique, I suggest getting rid of id and having



    PRIMARY KEY(CinemaID, MovieId, Showtime)
    INDEX(CinemaID, Showtime)


    Are there cases where the main part of the WHERE is MovieID=...?



    Cinemas - A list of each city and its cinema (ID and name):



    SELECT Cinema, CinemaID FROM Cimemas;  -- (no index needed)


    Movies - A list of movies that has been/will be shown at the cinema.



    SELECT DISTINCT MovieID FROM ShowTimes WHERE CinemaID=...
    INDEX(CinemaID, MovieID) -- already handled by my proposed PK


    Showtimes - A list of all showtimes for all movies in all cities. -- This is a huge output; rethink the requirement for it. That is, think about what the Client will do with it.



    Most of these indexes could be deduced from studying indexing cookbook and composite indexing



    Note how I did not suggest indexes (single-column vs composite, plus specifics) until I saw the SELECTs.



    "columns to the index by order of selectivity, making the most selective (I'm guessing that means the most unique/with the most cardinality?)" -- No. Selectivity is not the key for designing a composite index. Start with all the = columns, in any order. (My Cookbook dwells on this topic.)



    "index can only be used by the table if the first column is included in the search query" -- Mostly true. Notice how I recommended 2 indexes (remember: the PK is an index). There are cases where an index can be used for GROUP BY or ORDER BY, ignoring the WHERE; but those are rare.



    "first in the composite index ... Showtime" -- It is usually counterproductive to put a DATETIME first in a composite index. In particular, your 3rd query can use both columns of INDEX(CinemaID, Showtime), but not both columns of INDEX(Showtime, CinemaID). It is easy to see this: Think of writing out two lists of cinemas and showtimes. Have one list sorted first on cinema (a la INDEX(cinema, time); have the other sorted on time. Think about which one would have all the rows for a particular cinema over a time range clumped ("clustered") together.



    If there is also a ScreenID due to two screens showing the same movie at the same time, tack it onto the PK. However, all the indexes will need rethinking.



    (Sorry, Willem, I wrote my Answer before reading yours -- We say almost the same stuff.)






    share|improve this answer


























    • Sorry for the confusion. The first three bullet points are just a description of what tables I have and what they contain (i.e. "A list of all showtimes for all movies in all cities" means the table contains all the showtimes across the entire website, but they're not output all at once - they are what's queried by the next three bullet points located under "On the site the user must be able to").

      – Visual Vincent
      7 hours ago













    • Edited question for (hopefully) better clarity. Now, I have a couple questions regarding your answer: 1) "Are there cases where the main part of the WHERE is MovieID=...?" - Do you mean when it's only checking MovieID or when it's checking MovieID first? (if the former: Not at the moment || if the latter: Also not at the moment, how exactly does it make things different?)

      – Visual Vincent
      7 hours ago






    • 1





      @VisualVincent - The order of columns in an index is important. If, for example, ScreenID is stuck in the middle of the index, but not used in the WHERE, the rest of that index is useless for that WHERE.

      – Rick James
      7 hours ago






    • 1





      @VisualVincent - A messy query: "Show all Tuesday matinee showings of G-rated movies." The likely technique is to index some of the stuff, then work harder to filter the rest of criteria.

      – Rick James
      7 hours ago






    • 1





      @VisualVincent - Yes, ScreenID last.

      – Rick James
      7 hours ago














    4












    4








    4







    WHERE CinemaID = ? ORDER BY Showtime  -- and
    WHERE CinemaID = ? AND (Showtime BETWEEN [date 12:00 AM] AND [date 12:00 PM]) -- need:
    INDEX(CinemaID, Showtime)

    WHERE CinemaID = ? AND MovieID = ? ORDER BY Showtime -- needs:
    INDEX(CinemaID, MovieId, Showtime) -- or
    INDEX(MovieId, CinemaID, Showtime)


    Assuming that the the triple (MovieId, CinemaID, Showtime) is unique, I suggest getting rid of id and having



    PRIMARY KEY(CinemaID, MovieId, Showtime)
    INDEX(CinemaID, Showtime)


    Are there cases where the main part of the WHERE is MovieID=...?



    Cinemas - A list of each city and its cinema (ID and name):



    SELECT Cinema, CinemaID FROM Cimemas;  -- (no index needed)


    Movies - A list of movies that has been/will be shown at the cinema.



    SELECT DISTINCT MovieID FROM ShowTimes WHERE CinemaID=...
    INDEX(CinemaID, MovieID) -- already handled by my proposed PK


    Showtimes - A list of all showtimes for all movies in all cities. -- This is a huge output; rethink the requirement for it. That is, think about what the Client will do with it.



    Most of these indexes could be deduced from studying indexing cookbook and composite indexing



    Note how I did not suggest indexes (single-column vs composite, plus specifics) until I saw the SELECTs.



    "columns to the index by order of selectivity, making the most selective (I'm guessing that means the most unique/with the most cardinality?)" -- No. Selectivity is not the key for designing a composite index. Start with all the = columns, in any order. (My Cookbook dwells on this topic.)



    "index can only be used by the table if the first column is included in the search query" -- Mostly true. Notice how I recommended 2 indexes (remember: the PK is an index). There are cases where an index can be used for GROUP BY or ORDER BY, ignoring the WHERE; but those are rare.



    "first in the composite index ... Showtime" -- It is usually counterproductive to put a DATETIME first in a composite index. In particular, your 3rd query can use both columns of INDEX(CinemaID, Showtime), but not both columns of INDEX(Showtime, CinemaID). It is easy to see this: Think of writing out two lists of cinemas and showtimes. Have one list sorted first on cinema (a la INDEX(cinema, time); have the other sorted on time. Think about which one would have all the rows for a particular cinema over a time range clumped ("clustered") together.



    If there is also a ScreenID due to two screens showing the same movie at the same time, tack it onto the PK. However, all the indexes will need rethinking.



    (Sorry, Willem, I wrote my Answer before reading yours -- We say almost the same stuff.)






    share|improve this answer















    WHERE CinemaID = ? ORDER BY Showtime  -- and
    WHERE CinemaID = ? AND (Showtime BETWEEN [date 12:00 AM] AND [date 12:00 PM]) -- need:
    INDEX(CinemaID, Showtime)

    WHERE CinemaID = ? AND MovieID = ? ORDER BY Showtime -- needs:
    INDEX(CinemaID, MovieId, Showtime) -- or
    INDEX(MovieId, CinemaID, Showtime)


    Assuming that the the triple (MovieId, CinemaID, Showtime) is unique, I suggest getting rid of id and having



    PRIMARY KEY(CinemaID, MovieId, Showtime)
    INDEX(CinemaID, Showtime)


    Are there cases where the main part of the WHERE is MovieID=...?



    Cinemas - A list of each city and its cinema (ID and name):



    SELECT Cinema, CinemaID FROM Cimemas;  -- (no index needed)


    Movies - A list of movies that has been/will be shown at the cinema.



    SELECT DISTINCT MovieID FROM ShowTimes WHERE CinemaID=...
    INDEX(CinemaID, MovieID) -- already handled by my proposed PK


    Showtimes - A list of all showtimes for all movies in all cities. -- This is a huge output; rethink the requirement for it. That is, think about what the Client will do with it.



    Most of these indexes could be deduced from studying indexing cookbook and composite indexing



    Note how I did not suggest indexes (single-column vs composite, plus specifics) until I saw the SELECTs.



    "columns to the index by order of selectivity, making the most selective (I'm guessing that means the most unique/with the most cardinality?)" -- No. Selectivity is not the key for designing a composite index. Start with all the = columns, in any order. (My Cookbook dwells on this topic.)



    "index can only be used by the table if the first column is included in the search query" -- Mostly true. Notice how I recommended 2 indexes (remember: the PK is an index). There are cases where an index can be used for GROUP BY or ORDER BY, ignoring the WHERE; but those are rare.



    "first in the composite index ... Showtime" -- It is usually counterproductive to put a DATETIME first in a composite index. In particular, your 3rd query can use both columns of INDEX(CinemaID, Showtime), but not both columns of INDEX(Showtime, CinemaID). It is easy to see this: Think of writing out two lists of cinemas and showtimes. Have one list sorted first on cinema (a la INDEX(cinema, time); have the other sorted on time. Think about which one would have all the rows for a particular cinema over a time range clumped ("clustered") together.



    If there is also a ScreenID due to two screens showing the same movie at the same time, tack it onto the PK. However, all the indexes will need rethinking.



    (Sorry, Willem, I wrote my Answer before reading yours -- We say almost the same stuff.)







    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited 6 hours ago

























    answered 8 hours ago









    Rick JamesRick James

    42.3k22258




    42.3k22258













    • Sorry for the confusion. The first three bullet points are just a description of what tables I have and what they contain (i.e. "A list of all showtimes for all movies in all cities" means the table contains all the showtimes across the entire website, but they're not output all at once - they are what's queried by the next three bullet points located under "On the site the user must be able to").

      – Visual Vincent
      7 hours ago













    • Edited question for (hopefully) better clarity. Now, I have a couple questions regarding your answer: 1) "Are there cases where the main part of the WHERE is MovieID=...?" - Do you mean when it's only checking MovieID or when it's checking MovieID first? (if the former: Not at the moment || if the latter: Also not at the moment, how exactly does it make things different?)

      – Visual Vincent
      7 hours ago






    • 1





      @VisualVincent - The order of columns in an index is important. If, for example, ScreenID is stuck in the middle of the index, but not used in the WHERE, the rest of that index is useless for that WHERE.

      – Rick James
      7 hours ago






    • 1





      @VisualVincent - A messy query: "Show all Tuesday matinee showings of G-rated movies." The likely technique is to index some of the stuff, then work harder to filter the rest of criteria.

      – Rick James
      7 hours ago






    • 1





      @VisualVincent - Yes, ScreenID last.

      – Rick James
      7 hours ago



















    • Sorry for the confusion. The first three bullet points are just a description of what tables I have and what they contain (i.e. "A list of all showtimes for all movies in all cities" means the table contains all the showtimes across the entire website, but they're not output all at once - they are what's queried by the next three bullet points located under "On the site the user must be able to").

      – Visual Vincent
      7 hours ago













    • Edited question for (hopefully) better clarity. Now, I have a couple questions regarding your answer: 1) "Are there cases where the main part of the WHERE is MovieID=...?" - Do you mean when it's only checking MovieID or when it's checking MovieID first? (if the former: Not at the moment || if the latter: Also not at the moment, how exactly does it make things different?)

      – Visual Vincent
      7 hours ago






    • 1





      @VisualVincent - The order of columns in an index is important. If, for example, ScreenID is stuck in the middle of the index, but not used in the WHERE, the rest of that index is useless for that WHERE.

      – Rick James
      7 hours ago






    • 1





      @VisualVincent - A messy query: "Show all Tuesday matinee showings of G-rated movies." The likely technique is to index some of the stuff, then work harder to filter the rest of criteria.

      – Rick James
      7 hours ago






    • 1





      @VisualVincent - Yes, ScreenID last.

      – Rick James
      7 hours ago

















    Sorry for the confusion. The first three bullet points are just a description of what tables I have and what they contain (i.e. "A list of all showtimes for all movies in all cities" means the table contains all the showtimes across the entire website, but they're not output all at once - they are what's queried by the next three bullet points located under "On the site the user must be able to").

    – Visual Vincent
    7 hours ago







    Sorry for the confusion. The first three bullet points are just a description of what tables I have and what they contain (i.e. "A list of all showtimes for all movies in all cities" means the table contains all the showtimes across the entire website, but they're not output all at once - they are what's queried by the next three bullet points located under "On the site the user must be able to").

    – Visual Vincent
    7 hours ago















    Edited question for (hopefully) better clarity. Now, I have a couple questions regarding your answer: 1) "Are there cases where the main part of the WHERE is MovieID=...?" - Do you mean when it's only checking MovieID or when it's checking MovieID first? (if the former: Not at the moment || if the latter: Also not at the moment, how exactly does it make things different?)

    – Visual Vincent
    7 hours ago





    Edited question for (hopefully) better clarity. Now, I have a couple questions regarding your answer: 1) "Are there cases where the main part of the WHERE is MovieID=...?" - Do you mean when it's only checking MovieID or when it's checking MovieID first? (if the former: Not at the moment || if the latter: Also not at the moment, how exactly does it make things different?)

    – Visual Vincent
    7 hours ago




    1




    1





    @VisualVincent - The order of columns in an index is important. If, for example, ScreenID is stuck in the middle of the index, but not used in the WHERE, the rest of that index is useless for that WHERE.

    – Rick James
    7 hours ago





    @VisualVincent - The order of columns in an index is important. If, for example, ScreenID is stuck in the middle of the index, but not used in the WHERE, the rest of that index is useless for that WHERE.

    – Rick James
    7 hours ago




    1




    1





    @VisualVincent - A messy query: "Show all Tuesday matinee showings of G-rated movies." The likely technique is to index some of the stuff, then work harder to filter the rest of criteria.

    – Rick James
    7 hours ago





    @VisualVincent - A messy query: "Show all Tuesday matinee showings of G-rated movies." The likely technique is to index some of the stuff, then work harder to filter the rest of criteria.

    – Rick James
    7 hours ago




    1




    1





    @VisualVincent - Yes, ScreenID last.

    – Rick James
    7 hours ago





    @VisualVincent - Yes, ScreenID last.

    – Rick James
    7 hours ago











    0














    First I would definitely not define the primary key as a composite key of (CinemaID, MovieID, Showtime). Think of what that means: if you want to link an entry you always need those data instead of just an id.



    When you create a database I would recommend you to stick to the rules of normalization. They were made because of a reason.
    I understand that you want to create it as fast and memory-saving as possible, but we have no idea yet were the bottle-neck will be.
    Just create the scheme like you would do it perfectly normalized. Then fill it with test data (a few million entries) and see if the performance seriously is affected.
    When you do that you also have some numbers like "this query takes 2 times more memory than the others", and then you can have a look at this.



    Currently this smells of premature optimization .



    Performance in software is like taste in food: you can guess if it is good, but in the end you have to try it anyway.






    share|improve this answer








    New contributor




    Emarci is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
    Check out our Code of Conduct.

























      0














      First I would definitely not define the primary key as a composite key of (CinemaID, MovieID, Showtime). Think of what that means: if you want to link an entry you always need those data instead of just an id.



      When you create a database I would recommend you to stick to the rules of normalization. They were made because of a reason.
      I understand that you want to create it as fast and memory-saving as possible, but we have no idea yet were the bottle-neck will be.
      Just create the scheme like you would do it perfectly normalized. Then fill it with test data (a few million entries) and see if the performance seriously is affected.
      When you do that you also have some numbers like "this query takes 2 times more memory than the others", and then you can have a look at this.



      Currently this smells of premature optimization .



      Performance in software is like taste in food: you can guess if it is good, but in the end you have to try it anyway.






      share|improve this answer








      New contributor




      Emarci is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.























        0












        0








        0







        First I would definitely not define the primary key as a composite key of (CinemaID, MovieID, Showtime). Think of what that means: if you want to link an entry you always need those data instead of just an id.



        When you create a database I would recommend you to stick to the rules of normalization. They were made because of a reason.
        I understand that you want to create it as fast and memory-saving as possible, but we have no idea yet were the bottle-neck will be.
        Just create the scheme like you would do it perfectly normalized. Then fill it with test data (a few million entries) and see if the performance seriously is affected.
        When you do that you also have some numbers like "this query takes 2 times more memory than the others", and then you can have a look at this.



        Currently this smells of premature optimization .



        Performance in software is like taste in food: you can guess if it is good, but in the end you have to try it anyway.






        share|improve this answer








        New contributor




        Emarci is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
        Check out our Code of Conduct.










        First I would definitely not define the primary key as a composite key of (CinemaID, MovieID, Showtime). Think of what that means: if you want to link an entry you always need those data instead of just an id.



        When you create a database I would recommend you to stick to the rules of normalization. They were made because of a reason.
        I understand that you want to create it as fast and memory-saving as possible, but we have no idea yet were the bottle-neck will be.
        Just create the scheme like you would do it perfectly normalized. Then fill it with test data (a few million entries) and see if the performance seriously is affected.
        When you do that you also have some numbers like "this query takes 2 times more memory than the others", and then you can have a look at this.



        Currently this smells of premature optimization .



        Performance in software is like taste in food: you can guess if it is good, but in the end you have to try it anyway.







        share|improve this answer








        New contributor




        Emarci is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
        Check out our Code of Conduct.









        share|improve this answer



        share|improve this answer






        New contributor




        Emarci is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
        Check out our Code of Conduct.









        answered 48 mins ago









        EmarciEmarci

        1




        1




        New contributor




        Emarci is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
        Check out our Code of Conduct.





        New contributor





        Emarci is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
        Check out our Code of Conduct.






        Emarci is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
        Check out our Code of Conduct.






















            Visual Vincent is a new contributor. Be nice, and check out our Code of Conduct.










            draft saved

            draft discarded


















            Visual Vincent is a new contributor. Be nice, and check out our Code of Conduct.













            Visual Vincent is a new contributor. Be nice, and check out our Code of Conduct.












            Visual Vincent is a new contributor. Be nice, and check out our Code of Conduct.
















            Thanks for contributing an answer to Database Administrators Stack Exchange!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f229064%2fwhich-columns-should-be-indexed-when-all-may-be-used-in-different-search-queries%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Polycentropodidae

            Magento 2 Error message: Invalid state change requested

            Paulmy