What Was the Worst Database Decision You Ever Made?

Eyes up here, kid.

I’m not talking about “oops” mistakes.

I’m talking about things you decided to do on purpose, and then deeply regretted later.

Me personally, my biggest database regret was that time almost twenty years ago when I ran a Profiler trace from my desktop computer.

My very slow desktop computer.

Which was connected to production over a VPN.

<sigh>

I can’t remember what happened first – running out of space on my C drive, or my VPN disconnecting – but in the hot mess of it all, I fielded a call from my manager asking about why the production SQL Server kept freezing up. Everything about that situation was frustrating.


101 Comments

  • Drew Furgiuele
    June 16, 2020 7:51 am

    The worst decision I made was during my first big version upgrade of SQL Server.

    I had the brightest of ideas that I could just “detach/attach” databases on one server, then copy the hundreds of GB of databases over the network from Server A to Server B. Then attach again. This was before I knew about features like log shipping. Had I done my homework, I could have had the cold spares almost-synced ahead of time, then done the cutover in minutes instead of an 18-hour marathon of file copies overnight and into a Sunday morning. Live and learn, I guess.
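    For anyone curious, a minimal sketch of pre-staging a cold spare with backup/restore, using hypothetical database and path names; this is essentially the loop that log shipping automates:

    -- On Server A: full backup, copied to Server B well ahead of the cutover window
    BACKUP DATABASE BigDb TO DISK = N'\\ServerB\Staging\BigDb_full.bak' WITH CHECKSUM;

    -- On Server B: restore, but leave the database restoring so more log can be applied
    RESTORE DATABASE BigDb FROM DISK = N'D:\Staging\BigDb_full.bak' WITH NORECOVERY;

    -- Repeat as often as needed before cutover: ship and apply log backups
    BACKUP LOG BigDb TO DISK = N'\\ServerB\Staging\BigDb_log1.trn';
    RESTORE LOG BigDb FROM DISK = N'D:\Staging\BigDb_log1.trn' WITH NORECOVERY;

    -- At cutover: apply the final (tail) log backup, then bring the spare online
    RESTORE DATABASE BigDb WITH RECOVERY;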

    • Drew – I keep meaning to blog about that too! People always forget how cool log shipping is in that scenario.

      • Greg Loeffler
        June 16, 2020 9:54 am

        Log shipping recently helped me migrate a 1.7TB database from on-prem to AWS with about 10 minutes of downtime instead of hours of copying files. So glad I went that route.

  • Ever giving report writers and analysts access to the entire database instead of only the columns their reports consume.

    • Keith – ouch! Yeah, that’s rough.

      • I’d be interested to know any tips for fixing this… because, y’know, I might be in a similar spot…

        • I vent about them using colorful language when they walk away. It doesn’t fix anything, but venting helps me.

          Recently I’ve been rebuilding my entire SQL environment and have shut everyone out of the live databases. If they want to report against something, they have to ask for the data, which I copy to another SQL Server they’re allowed to query, and they can tank that server all they want.

        • Vegard Hagen
          June 16, 2020 2:39 pm

          Permissions can be granted on specific objects, like a table, a view, or a stored procedure. The smart way to work this is to grant everyone exactly the permissions they need on exactly the objects they need to work on. Granting db_owner on the entire database or sysadmin on the entire server is the flat-out dumb way. Sure, you’ll never need to figure out what permissions they ACTUALLY needed, but you’ll also never be able to STOP them from doing anything at all until AFTER they’ve already SCREWED EVERYTHING UP, because YOU gave them permission to…!
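          For instance, a hedged sketch of object-level grants, with hypothetical role, view, procedure, and group names:

          -- Create a role for report writers and grant only what their reports consume
          CREATE ROLE ReportReaders;
          GRANT SELECT ON dbo.vSalesSummary TO ReportReaders;
          GRANT EXECUTE ON dbo.usp_GetCustomerOrders TO ReportReaders;
          ALTER ROLE ReportReaders ADD MEMBER [DOMAIN\AnalystGroup];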

          • Analysts can create plenty of problems and support overhead even with only read access to every table in a database. They’ll eventually start digging around your tables out of some morbid curiosity and tank a server with a query so horrible that a query tuning textbook couldn’t even provide an example of it. Or worse, they’ll implement a logical error based on assumptions they made while indulging that cat-like curiosity, one that works most of the time but sometimes doesn’t, and then you spend days investigating a problem hidden in loose queries inside a report you can’t search for.

            As for object-level permissions: I have either worked in places small enough that I controlled everything that happened to SQL and it wasn’t necessary, or in a much larger place where deciding anything required a series of at least 7 on-prem meetings (where some participants had to drive in from the suburbs, always at 8 in the morning or earlier) and it was impractical to control things at that granularity.

            I’m really hoping my new model of making them tell me exactly what data they need so I can send it to them cuts all this down, so they can operate with autonomy without creating problems for me.

    • I did that, and the staff used Business Objects… that was a painful day.

      • Not only that, giving them sysadmin or sa access ‘because they need it’. 🙁

      • If analysts are allowed to query your live database directly or through a BI tool, it is wise to consider setting MAXDOP to 1 or a low value to prevent them from starving other users of CPU. RCSI should also be considered.
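        A hedged sketch of both settings (instance-level MAXDOP plus RCSI on a hypothetical database name); test on your own workload first:

        -- Cap parallelism at the instance level
        EXEC sp_configure 'show advanced options', 1;
        RECONFIGURE;
        EXEC sp_configure 'max degree of parallelism', 2;
        RECONFIGURE;

        -- Turn on read committed snapshot isolation (kicks out other sessions to switch)
        ALTER DATABASE SalesDb SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;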

  • Erik Darling
    June 16, 2020 7:56 am

    XML

    • BWAAA_HAAA!!! You stole my thunder!

    • I quite like XML and XPath… within reason.
      Honestly, I came across a guy who was going to do an open schema using XML and one table with 3 columns:

      ID int identity (1, 1) NOT NULL,
      EntityType varchar(1024) NOT NULL,
      Entity XML NOT NULL

      What more did he need????
      I thought he needed restraining.

      • I love XML. Or maybe I just really hate JSON and appreciate XML?

        • Wanderlei Santos
          June 17, 2020 12:37 pm

          Watched a presentation on JSON, and why it was supposedly better than XML. Presenter got lost interpreting a multi-level nested JSON…

          • Yeah. Never mind that even the best JSON interpreters are horrendously slow. Never mind that there is no schema. Never mind that unless you wrote it yourself, it’s probably going to be unreadable to a human. Never mind that if you export bad data that causes the interpreter to choke, it just says the entire file is bad and you have to find the problem yourself. Never mind that for objects with even moderate complexity, you can quickly have something that spreads across dozens of pages and is impossible to navigate.

            The only thing that matters for JSON is that it is easier when you are doing high-school-level coding classes, using examples so trivially simple that you would never see them in the real world.

    • Right, I let the developers store massive XML in the database. It had some advantages, but when the app needlessly pulls 20 of these XMLs out of the DB (which takes a long time, and the app might run out of memory), then even the DBA can’t fix it.

  • Tried to learn Oracle once.

    Once.

    • Kevin – HAHAHA, I made it about 5 minutes into that too, heh.

      • Kevin Fries
        June 16, 2020 8:54 am

        That’s nothing. Try DB2 (Noooooo, don’t!). Then try learning their implementation of Partitioning and their version of Column Store. Dare I mention their version of SSMS?

        Legend has it that it was devised by a Korean interrogator in the 50s.

        • When it comes to stability, I prefer DB2. I also like the Trusted Context (e.g. you only get permission if you come from server xyz)… and yes, DB2 is rough around the corners… and it has many corners.

    • Kenneth Igiri
      June 16, 2020 9:28 am

      lolz. Why is this the worst?

  • Setting up an iSCSI connection to an Iomega external drive on my production database server so I could quickly increase the space available. It worked great until the drive suddenly died and poof, away went some databases. Oh, and not having current backups, but I guess that’s more of an oops. I got away with that one because the databases were just historical records that either weren’t needed anymore or could be re-created, but…

  • Jerry Manioci
    June 16, 2020 9:09 am

    Implementing a 16-node PolyServe cluster for a very busy and important set of SQL instances. It was regular hell until we migrated to a different HA solution.

  • Back when I was a junior DBA, I was given a task to bulk load some users into a table. It went wrong, and so I decided that restoring the database would be easier than fixing the problem. I didn’t take a backup because I had no intention of keeping the data that I’d just messed up. After performing the restore I realised that the last backup was over 3 months old – I’d just wiped out 13 weeks of data for a safety control system that supported a nuclear power plant. The bit I regret: it was my first IT job and I didn’t want to get fired, so I deleted all of the SQL logs & Windows logs. My boss knew it was me, but he couldn’t prove it. I’m still ashamed of doing that nearly 15 years later.

    • First lesson I learned: own up to the mistakes you make. Sure, you might still get fired, but at least your conscience is clean!

  • Matthew Oates
    June 16, 2020 9:14 am

    I was troubleshooting a SQL Server failover cluster in PROD and saw the option to “Validate Cluster…”

    I ran it, not realizing that it actually takes the cluster down/offline in order to “validate.”

  • Paul Nielsen
    June 16, 2020 9:21 am

    Way back, in the SQL 6.5 days, when I was learning SQL Server and first read about fill factor, I misunderstood the meaning of the value and set it to 1 across the board thinkin’ that 1 meant to leave 1% free and fill 99%. The test database blew up in size, began to crawl, and the developers began to yell.
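    A hedged sketch with hypothetical names, for anyone who hit the same confusion: fill factor is the percentage of each page to fill, not the percentage to leave free.

    -- 90 fills each page 90% and leaves 10% free; a value of 1 leaves pages 99% empty
    ALTER INDEX IX_Orders_OrderDate ON dbo.Orders REBUILD WITH (FILLFACTOR = 90);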

  • Paul Nielsen
    June 16, 2020 9:23 am

    I once experimented with a poly-FK – meaning an FK that could point to PKs in multiple tables depending on the value in another column. Very dumb idea – a complexity that breeds complexities.

    • I think the technical term for that design pattern is actually “cluster-FK”, not “poly-FK” 🙂

  • Heh… your second biggest mistake must be that hat, Brent. 😀 At least it doesn’t look like one of Steve Jones’ shirts. 😀 😀 😀

  • Did an in-place upgrade from 2012 to 2014 with SSAS and SSIS all on the same machine… downtime: 3 hours.

  • Yoasmy Ortiz
    June 16, 2020 9:44 am

    Hi, I’m a junior DBA; I’ve been in this career for a year.
    My worst decision has been restarting a production server without stopping the SQL Server service first.
    The biggest database on the server got stuck in a suspect state. My partner and I had to restore the last backup, around 500 GB.
    I’m working on my English. Regards from Colombia.

  • Tom Uellner
    June 16, 2020 9:46 am

    I’ll let you know when I’m done setting up Always On Availability Groups… 😉

  • Caleb Acton
    June 16, 2020 9:53 am

    sp_configure ‘lightweight pooling’, 1… Back in my early DBA days on SQL 2000, I had recently read some interesting articles about how people were configuring their SQL Servers for max TPC-C results. At the time I was managing a multi-core system in an active/active cluster that was doing heavy OLTP activity and routinely running 5,000 context switches/sec. This fiber mode setting seemed like it could really make a big difference. I promptly made the change in production (dev didn’t have the load) and waited. Nothing much really happened until a week later, when the system fell on its face. Once it started getting any type of load, the wheels fell off. We limped along through the week with various Microsoft calls and changes… finally, after a week, my boss said let’s just try flipping that setting back, even though it had been changed a week before we had issues. Voila! It worked; service restored. What happened, you ask? Yep, you guessed it: a week after the configuration change, the server was rebooted for patches and the configuration went live. I didn’t read the fine print about that configuration setting requiring a restart.
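    A hedged sketch, on a modern version, of the check that would have caught this: the staged value and the running value sit side by side.

    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    EXEC sp_configure 'lightweight pooling', 1;
    RECONFIGURE;

    -- Compare the configured value with the value actually in effect before walking away
    SELECT name, value, value_in_use
    FROM sys.configurations
    WHERE name = 'lightweight pooling';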

  • Importing SSISDB packages to a new server with a script off the internet, not noticing that the script dropped all the ones that were already there.

  • I trusted the server team with the database backups. Do I need to say more?

  • On my first big development project I created an index on the equivalent of “Processing Date” in DESCENDING order because, well, the most recent ones, which are queried the most, would always be at the front of the line resulting in faster searches, of course. After 3 days performance went in the tank and we had to reindex those tables weekly. I had no idea about how SQL stores and uses indexes back then.

  • Hmm. I guess it’d be the things people complain about the most or that I have to keep dealing with. One in particular is helping architect an OLTP system that needed lots of historical data, so it half sits on our DW server. The developers hate me for it and the users hate the weird issues due to poor understanding of SCD type 2. 🙂 Good times.

  • Joe O'Connor
    June 16, 2020 10:53 am

    Created an index “with drop existing” in a busy database. It held up all the transactions for the duration. Two. Separate. Statements. Next. Time. (At least I get a next time!)
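    For anyone wondering what that looks like, a hedged sketch with hypothetical names; without ONLINE = ON, the rebuild blocks writes to the table for its whole duration:

    -- Rebuilds the existing index in one statement, blocking other transactions until it finishes
    CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
        ON dbo.Orders (CustomerId)
        WITH (DROP_EXISTING = ON);

    -- Enterprise Edition alternative that keeps the table available:
    -- WITH (DROP_EXISTING = ON, ONLINE = ON)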

  • Used a dash in the AG listener name.

    • Aaron – Thanks for posting this. We’re in the process of setting up AG for the first time and this is one more thing to add to my prep list.

    • At the expert level you got to try doing them with two consecutive dashes. 😛

      • Lars Rasmussen
        June 22, 2020 9:38 am

        What’s the problem with using a dash in a listener name? Yes, I’ve done that/do that. I also use CNAMEs in the connection strings that point to the listener names so the conn strings don’t ever have to be changed, just repointed to a new listener the next time you upgrade SQL Server.

  • So, startup company asks me to spec out a plan for their new website/system. They then have it built by offshore devs. Later they bring me in as CTO to run things and support this system. The devs decided that it would be more “secure” if every client company was provisioned their own SQL database, with identical copies of all of the tables. Whole site was built around this idea.

    With 5, 10 clients, it was not a big deal but once it got over 100, it started to be a big mess and my mistake was putting off the code changes required to merge it all into one DB.

    We finally did it when it was 250 clients, and it was a big deal because by then so much code had been built around this!

    • I’ve been a DBA for about 4 years now for one company, and had never done anything database-related before working here (I spent a few years learning SQL as I did regular support calls and then Systems Administration work). The product in question uses SQL as a backend, and each customer does indeed have their own SQL instance. This means that in a group of 5 DBAs (initially 3 when I moved to the role) we are maintaining around FOUR THOUSAND SQL instances. Once at a SQL Saturday the speaker had us keep our hands up as she went through more and more instances, and eventually gave up and asked us how many. At lunch later, someone approached me and asked how we manage that many. I explained that for us it wasn’t individually managing that many so much as keeping the software that it runs down to about 8 different versions that all work the same way.

      We occasionally get “this one site had a power outage and a DB is suspect/corrupt” type calls, but most of our problems are “this query in this software build is slow” or “this task needs to do this one thing, write it into the task.” In that sense, we’re more Database Developers than DBAs. Fix it, deploy to every site running that build (and product deployment is handled by another group after testing is done by yet a different group). It’s a crazy way to do things, but for me it’s all I’ve known as a DBA. And our call volume is going down over the years because we’re getting in front of the build performance/stability problems, so we must be doing something right. 🙂

  • I keep outdoing myself in terms of “worst” mistakes, so I’ll just go with the one I’m grappling with now, namely the consequences of failing to migrate an early 90s database through successive editions of [ouch!] Microsoft Access. Of course 25+ years later, I find myself in a situation where I need to refer to that old thing. So, where is it? That part of the comedy ended when I blew the dust off an ancient backup drive I found lurking in the back of a cupboard (whew!). I was feeling pretty smug until I tried to load the thing and Access 2016 puked on it. Happily (no, make that sadly) I had saved some documentation with the old database file. Reading that documentation told me that in order to load the old database file all I need to do is build a VM and install Windows 3.1, which I of course no longer have. And even if I did have Windows 3.1, I’d need a DOS OS to run it on, which I of course no longer have. And even if I did have Windows 3.1 and a DOS OS, I’d need MS Access 2.0 to load the old database. Or I could maybe convert the thing by running it through successive versions of Access 97, Access 2003, Access 2010, etc. but you guessed it, I no longer have those either. There are a couple of morals here: (1) don’t just back up your databases–convert them if necessary as new versions of your DBMS come along; and (2) I am no longer laughing at my pack-rat friends who have basements full of old computers, floppy disc collections of software, ancient manuals, etc. etc.

    • I feel your pain; I just did that yesterday. Luckily, I had folders of the software CDs/floppies. But, unluckily, no license codes were saved with them.

  • Alin Selicean
    June 16, 2020 11:30 am

    Ran multiple DBCC SHRINKFILE on different FGs until I hit a point when files were from the same FG. At that point… well, all hell broke loose… Never again.

  • I needed to replace an index on a large table. So I ran:

    DROP INDEX NCI_blah ON dbo.blahblah
    GO
    CREATE INDEX NCI_blah_v2 ON dbo.blahblah (…)

    This was a production system, and this was started off-hours. The backout plan, if it didn’t work as desired, or if it ran too long, was to rollback the script. Sure, that’s single-threaded, and might take a few minutes, but we planned for that.

    A few minutes before business-hours were to start, it wasn’t done. So I hit the “stop” button in SSMS and it happily rolled the transaction back in a reasonably timely fashion. Rolled “the” transaction back. The current transaction. The one AFTER the “GO” statement that separated this into multiple batches.

    I didn’t realize it had rolled back the create, but not the preceding drop. The company websites and the most common in-house applications all ground to a halt.

    Lesson learned, verify the darn DDL is in a good state before letting people use it. Transaction control, et al, is great, but one step beyond that is actually looking at the results and knowing what they should be vs what they are.
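    A hedged sketch of one way to make “stop” undo both steps: run them as a single explicit transaction instead of separate batches (same hypothetical names as above, with a placeholder key column):

    BEGIN TRANSACTION;
        DROP INDEX NCI_blah ON dbo.blahblah;
        CREATE INDEX NCI_blah_v2 ON dbo.blahblah (SomeColumn);  -- placeholder column list
    COMMIT TRANSACTION;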

  • John Waclawski
    June 16, 2020 11:53 am

    Brent,

    I too, as a fresh DBA, ran a profiler trace on production, during business hours. My manager had just shown me what it was, while we were at his computer. Little did I know he was probably on a dev server.
    Fast forward to when I was left to learn the servers: I fired up Profiler and connected to our biggest production server. “Let’s see what I can see on this server,” I thought.
    A mere few minutes later I saw people around me panicking! Devs, managers, even the CIO: “Why has production screeched to a halt??” was everyone’s question.
    A moment later my manager came to my cubicle: “Are you running a trace on production?”
    Innocent me: “Yeah?”
    “Please stop it,” he calmly said, and he walked away.
    That was 11 years ago and we still joke about it. But I’m glad to be in good company, and that I wasn’t the only one who has ever done that.
    Great subject matter, Brent. Thanks!

  • XML in a varchar column holding ~10kb of simple XML for the state of a system… and one element that was a Base64 String containing more GZipped XML data that could be 20MB or more – compressed!

    Oh, and the whole XML was also custom BZipped and then Base64 encoded again – because it was varchar – TBF there wasn’t an XML column type back then!

    Good times, until I needed to do an ad-hoc update to a value in the XML without using the application software (caused by a race condition updating the state in the XML) – cue a 20MB .sql file to change a few bits of XML that Management Studio wouldn’t even open.

    That 10kb of state could easily have gone into a few normalised tables and the massive blob of data into a varbinary column compressed, but I was young and XML was all the rage when I did that.

    I’m still living with the legacy of this today; the 2 XMLs are split out, but the first one still lives on as B64 BZip in a varchar and works perfectly.

  • Promoting scalar UDFs in solutions that made their way to production code. I came from a non-SQL background in which that made sense for modularization/reusability, and I had no idea about the terrible effects they have on the query execution engine! Years later, there are STILL a couple of them out there needing to be cleaned up.
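    A hedged illustration with made-up names of why they hurt: the scalar UDF runs once per row (and historically blocks parallelism), while a join does the same work set-based.

    -- Scalar UDF: executed for every row, and its cost is hidden from the plan
    CREATE FUNCTION dbo.fn_GetRegionName (@RegionId int)
    RETURNS varchar(50)
    AS
    BEGIN
        RETURN (SELECT RegionName FROM dbo.Regions WHERE RegionId = @RegionId);
    END;
    GO
    SELECT OrderId, dbo.fn_GetRegionName(RegionId) AS RegionName
    FROM dbo.Orders;

    -- Set-based rewrite: same result, one join the optimizer can actually reason about
    SELECT o.OrderId, r.RegionName
    FROM dbo.Orders AS o
    JOIN dbo.Regions AS r ON r.RegionId = o.RegionId;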

  • Steven Vandever
    June 16, 2020 1:14 pm

    When I first started as a DBA many, many years ago, I decided that the wizard was an easy method of creating maintenance jobs. Especially the ones I created for each and every database on each and every SQL server for rebuilding every index every night at the same time.

    A coworker of mine, back when she used to be a customer, decided to run a query she found on the internet against her database that would do a wildcard search of every column and row of every table in a very large database, bringing the SQL server to a screeching halt and affecting a few dozen applications. She bought us apology donuts, so all good 🙂 We are cheap. She works for us now and she is pretty awesome.

    • Steven Vandever
      June 16, 2020 1:16 pm

      Oh, and it’s possible I ran a shrink DB against those same databases after running the index rebuilds.

  • Worst DB decisions:

    1. mid-90s and letting myself get talked into using RAID-5 for the SQL Servers

    2. Thinking PDW would work as an effective tool

    3. Using FK with cascade

    4. Doing without FK

    5. Letting a developer do the formatting of a Customs declaration report in T-SQL

    6. Trying to enlist multiple sessions within the same transaction to the same database so they didn’t block each other longer than necessary. I think I was planning on using “Application” locks.

    7. Trying to use ActiveX / OLE (??) objects from T-SQL stored procedures to do an “out of transaction” call to get the next key sequence number from a table. This “next sequence key” was a major pinch point / blocker in the application, resulting in the processing basically being serial. Savepoints and a table with no rows were so much easier, until SEQUENCES arrived (see the sketch after this list).

    8. Trying to use SQL-CLR to do “out of transaction” logging. There is no way.
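    A hedged sketch of the SEQUENCE approach mentioned in item 7 (SQL Server 2012+, hypothetical names): no hot “next key” table to serialize on.

    CREATE SEQUENCE dbo.OrderKeySeq AS bigint START WITH 1 INCREMENT BY 1 CACHE 100;

    DECLARE @NextKey bigint = NEXT VALUE FOR dbo.OrderKeySeq;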

    • Up-voting this 🙂 Can I throw in multiple (as in 7!!) instances of SQL 2000 / 2005 on Windows Server 2003 32-bit? And being talked into using high density Blades with slow iSCSI storage presented from a cheap entry level software “SAN” using 3.5″ SATA disks in RAID5 config…. Finally, using PRAGMA AUTONOMOUS TRANSACTION in Oracle to fix mutating trigger issues (walk of shame on that one especially).

  • Actual “oops” mistake?
    The weekend before I left a job, we upgraded from SQL2000 to SQL2008R2. Everything looked good and ran smoothly for my last week. And the PDW looked like it would be a good useful tool.

    Three weeks later, two weeks after my account was disabled/deleted, I got a panicked phone call from my old boss: the server was running like a dog. My replacement was trying to fix it by adding indexes, to no avail.

    My old boss (an Apache server admin) and I tracked it down, using the PDW, to the replication distribution failing. Replication on the new server was set to use my account for some part of it. When I checked this as part of the upgrade (it was a step in the upgrade plan), I must have checked the old server.

  • Probably my real worst was actually my first exposure to SQL 2000, around 2014 or so. I don’t remember exactly what I was doing, but I believe I was copying data from one db to another and found a tool somewhere in Enterprise Manager that looked really handy, and thought, “Why doesn’t SSMS have something this easy?” Turns out it blows away the entire destination database when you run it.

    • Hahaha!!!
      Yup, never blindly trust the tooling.
      Not the first and not the last person to get caught out.

  • As a very junior developer almost 30 years ago, I had made a change to the collation of our internal test setup to fix an issue and all worked as expected. Six months down the line, I was working on-site at a client (a large financial services institution) and came across the same issue. So, I changed the collation of the LIVE environment that also had several other systems running against it. What could possibly go wrong?!! In my defence, my supervisor should never have let me loose on the live environment. Needless to say, the on-site DBA was incandescent and I brought all their systems to a grinding halt for a couple of hours. Miraculously, I still kept my job.

  • Andrew Hill
    June 16, 2020 6:42 pm

    Almost any time during database design that someone has said “we don’t need to store that.”
    Especially if it’s a timestamp.

    • Andrew Hill
      June 16, 2020 6:42 pm

      Extra bonus points if someone says “using datetimeoffset rather than datetime will just complicate things”.

  • I had just built my first SQL AlwaysOn Availability Group cluster and wanted to add a third node. I was overconfident, ran ahead in the middle of the day, added the node, and missed that little check box, ‘Add all eligible storage to the cluster’.

    Brought the whole CRM environment down… tense times, but like many have said, I was honest, owned up to it, and got help from the sysadmins. One of them still mentions it 10 years later.

  • My single biggest mistake? Not keeping a list….
    A list of the things that went wrong after thinking “well, what could possibly go wrong if I do this”.

  • Erik Dahlen
    June 17, 2020 4:40 am

    I was once at a meeting with a new customer, a rather small business. During the meeting, I told them about Express and said it would be enough for them.

    Later on, they refused to buy a license.

    I don’t even know where to start with all the features they wanted me to fix for them anyway. (Everything from backups without the Agent to the 10 GB per-database limit.) It was a mess, but I learned a lot.

    From that day on, I never recommend Express. Databases will always grow.

    • Thomas Franz
      June 18, 2020 6:47 am

      For backups it’s easy – just use the Windows Task Scheduler and sqlcmd (I do it on one of our more important SQL Express instances).
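      A hedged sketch of that kind of scheduled command, with hypothetical instance, database, and path names (drop it in a .cmd file that Task Scheduler runs nightly):

      sqlcmd -S .\SQLEXPRESS -E -Q "BACKUP DATABASE [AppDb] TO DISK = N'D:\Backups\AppDb.bak' WITH INIT, CHECKSUM"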

      • CHRIS HOLDING
        June 20, 2020 5:18 am

        I’m using FILESTREAM to store BLOBs on Express as a means to reduce licensing costs. Not seen any gotchas… yet.

  • Saturday morning, making a large data change to a table. The outage window was until about midday. After 2 or 3 hours or so, the transaction log had grown so large that it had used up all the remaining space on its drive. I can’t remember how large it had got, but it was in the hundreds of GBs. I thought about calling the storage admin, but it was Saturday and I didn’t want to disturb him, even though he could have simply jumped on the VPN, provisioned more storage in about 10 minutes, and allowed the process to finish, probably well within the change window. I just said to myself, “oh well, this change isn’t going to work, I’ll just kill the transaction and attempt it again when we’ve sorted the disks out another day.” I knew the transaction had to roll back, but thought, it can’t take that long, can it?

    I think it finished rolling back about 2 am Sunday and the app database was unavailable the whole time.

  • I copied and pasted 1000s of ldf/mdf files over a weekend for a migration. It was like 60 hours. I didn’t know enough about what I was doing. AMA.

  • Rasmus Schultz
    June 17, 2020 9:46 pm

    I used an ORM. I then, of course, used another ORM, and more ORMs, in different programming languages, for years and years, over and over, and nothing ever improved – it was merely bad in different ways, every time.

    The best decision I ever made was learning to love SQL and getting along with just a simple query builder.

    Now, years later, I’m using a popular, mainstream ORM again (not by choice) and it’s bad in all the same ways that ORM was always bad, and always will be.

    This entire industry needs to get over the empty, meaningless pursuit of better ORM and learn to love SQL again.

  • In light of the (recent) European GDPR, I think one of my biggest mistakes is allowing all kinds of email columns to safely breathe in the databases that I currently manage. Emails, like addresses, like people’s names, like A LOT of other personal stuff (cars’ license plates, for another example), should explicitly exist in ONE SINGLE place (aka a table for each entity) across your database instance.
    That would have saved a lot of my time when dealing with my Production, Pre-production and Development environments while trying to reach GDPR compliance. Heck, that would have even saved me from becoming a DPO!!!!

  • Todd Salzman
    June 18, 2020 6:43 am

    The biggest regret was one time when we were in the middle of setting up a DR environment and the infrastructure guy asked me if I wanted to start pushing the files for the largest sql db during the week. I looked at him and said, I don’t want to impact the processing during the week, can we wait until the weekend? Wouldn’t you know it, the next day the storage array hardware guy came in and updated the storage array firmware and noticed a drive was suspect so he hot swapped it. Which of course triggered a bug in that new firmware and killed the array. Spent that whole weekend rebuilding the largest database from backups and logs. It all could have been avoided if I had just let the infrastructure guy push those files to the DR site during production hours.

  • Thomas Franz
    June 18, 2020 6:44 am

    What about inserting 100k rows from the clipboard (Excel) in the SSMS “Edit Top 200 Rows” window?

    It takes forever (since it inserts row by row), blocks SSMS and (sometimes, at least) Excel, and of course you should not / cannot use the clipboard in the meantime…

  • During a security audit I saw that the guest user account had application-level rights. I had just been hired, after six months of probation without incident, and I was told that if I made even one mistake I would be let go. I had completed a large SQL Server migration project from on-premises to a data center, with 100s of ETLs from Oracle to MS and BACK, a possible nightmare of a scenario, completed without incident and on time. Smooth as butter in my first week. I thought, easy peasy. So I thought I would clean up some accounts I saw orphaned after the SQL migration. Well… I removed some of the guest account’s rights to the production SQL Server, “per best practices.” All the internal users plus the public-facing application shared access through the guest account. WTF? I reversed the problem within five minutes. I was let go at the end of the week with an impromptu meeting request where the VP showed up at my desk with a cardboard box. It was a nice 6 months and two weeks. 😀

    • OUCH

    • Brett Shearer
      June 30, 2020 9:59 pm

      The best thing about letting people go after they make a mistake is that you get REALLY good at it and it is a skill you can use for the rest of your career 😉
      Everyone makes mistakes. The trick is to learn from them – it’s called experience.

      • Years of experience tell me you’re special and gifted with a wonderful sense of humor, only outweighed by your years of experience and profound advice. So clever! You will go far. Hopefully very, very, very far.

        Thank you 🙂

  • Sanket Chakane
    June 23, 2020 8:31 am

    I backed up a stored proc (more than 1,000 lines) using sp_helptext before replacing it with a new one in prod, on a database of more than 3 TB. Some time later the user asked me to revert it, and while reverting I got no end of syntax errors (sp_helptext wraps long lines, which mangled the script); I sat on it for many hours but the errors kept coming. In the end, the sp was reverted from a backup that had already been restored on another server a week before.

  • Back when I didn’t know any better (or anything), I suggested moving a DB file off the system drive (so far so good) using DBCC SHRINKFILE WITH EMPTYFILE, rather than simply reading the documentation page about moving database files.
    Did I mention I was the only (highly accidental) DBA at that time?
    Cue in 12+ hours of monitoring the process, and general misery.
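    For the record, a hedged sketch of the documented way to relocate a data file, with hypothetical names; the database is unavailable only for the duration of the file copy:

    -- Point the catalog at the new location (takes effect when the database next comes online)
    ALTER DATABASE AppDb
        MODIFY FILE (NAME = AppDb_Data, FILENAME = N'E:\Data\AppDb_Data.mdf');

    -- Then: take the database offline, move the physical file at the OS level, bring it back
    ALTER DATABASE AppDb SET OFFLINE WITH ROLLBACK IMMEDIATE;
    ALTER DATABASE AppDb SET ONLINE;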

  • The first time I had a really long-running rollback (like, hours in already) and decided the way to fix the problem was to restart the server… it didn’t fix the problem.

  • At the time I was working in a small IT shop where we all wore many hats. My IT manager was on vacation and I was helping the Dell technician who configured our new rack with a shiny new database server, RAID array, app servers, DW server, etc.
    As he left in the early morning hours, we were about to power up the server when I found that the whole rack was simply plugged into the wall. Hmm, I thought, the technician just plugged it in, not realizing that we had a UPS. Off I went, mentally dusting off my long-obsolete MCSE, and plugged it into our UPS. We turned on our equipment one by one, and everything went well until later in the day, when the load reached its peak and suddenly the server shut down.
    That rack was dedicated to a major bank which was our main customer, driving 45% of our revenues. The phones started ringing and the CEO came to the server room where I was scratching my head. The UPS was under-powered and shut down the server when it couldn’t provide enough power. I explained it to my red-eyed CEO. I told him it was unexpected that my manager would not match the new equipment’s power requirements with a properly sized UPS. It would have been a $5,000 UPS to supply power to over a million dollars sitting in that rack. He angrily retorted that Bill (not his real name) had made that request, but it was more like a wish list, so it was rejected.
    The moral of this story is that a DBA should never mess with the system stuff. Leave it to the system people and ask no questions. Realizing that I was slated for the chopping block at the first opportune moment, I left the company a few months later.

  • Vikas Panjabi
    July 16, 2020 10:27 pm

    Hehe, I made a similar mistake running Profiler just a few weeks ago. I got a call from my manager too, as production went from slow to dead in the water for 20 minutes. This was my worst mistake, and I don’t know if it was Profiler or the VPN. From that day onwards, my approach to production systems has been completely different.

  • We are converting from Teradata to MSSQL, so I want to try and avoid a big mistake…

    When our applications submit SQL via ADO.Net with input parameters, the parameter definitions default to nvarchar. If that field is defined as varchar and is indexed on the database table, the index isn’t used effectively, resulting in a scan instead of a seek. We are converting from Teradata to SQL Server, so this is a systemic issue at this point in the conversion. The applications team submitted this to me:

    When we define anything as a String in code, the ADO.Net provider automatically assumes it is an
    NVarchar in SQL Server.

    One of their solutions is to only remedy this for tables over 1,000 rows. I think this is faulty on so many levels, but I’m looking for some additional input. I am a Teradata DBA transitioning to being a MSSQL DBA. I would assume this setting in ADO.Net would be configurable. To me it is obvious that the input parameter definition needs to match the field definition in the table, especially if it is part of an index, since a mismatch results in a full table scan. Can anybody help me with (1) how to set the input parameter definition to match the table definition, and (2) if this is systemic, why would it be a bad idea to only fix those parameters and queries if the table is over 1,000 records in size?
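    Not an answer to the ADO.Net side, but a hedged T-SQL illustration of the problem with made-up names: when the parameter is nvarchar and the column is varchar, the column side gets implicitly converted and the index often can only be scanned. In the .NET code, explicitly declaring the parameter type and length (for example SqlDbType.VarChar with a size) avoids the nvarchar default.

    -- Assume dbo.Orders has a varchar(20) OrderCode column with a nonclustered index on it
    DECLARE @code_nvarchar nvarchar(20) = N'ABC123';
    SELECT * FROM dbo.Orders WHERE OrderCode = @code_nvarchar;  -- implicit conversion, likely a scan

    DECLARE @code_varchar varchar(20) = 'ABC123';
    SELECT * FROM dbo.Orders WHERE OrderCode = @code_varchar;   -- types match, index seek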

  • OK… thanks. I’m new here and still trying to figure it all out. I emailed you once and you told me to go to the blog comments. I have posted this question on a couple of forums. I was just curious to know what anybody here had to say. I will learn eventually! Ha.

