With Docker 1.12 in swarm mode, there’s no need for etcd, Consul or Zookeeper. The built-in distributed key/value store manages the orchestration for you so building a cluster for development and testing is a breeze. However, this does not preclude the role of ops.
To quote Uber Engineering,
Postgres served us well in the early days of Uber, but we ran into significant problems scaling Postgres with our growth. Today, we have some legacy Postgres instances, but the bulk of our databases are either built on top of MySQL (typically using our Schemaless layer) or, in some specialized cases, NoSQL databases like Cassandra. We are generally quite happy with MySQL, and we may have more blog articles in the future explaining some of its more advanced uses at Uber.
The main limitation of the current Docker Swarm version (1.12-rc4) is the number of tasks the managers can handle.
This number is around 95,000.
It’s actually a very large number and, in practice, you probably won’t need that many tasks.
This appears to be a Swarm-level limit, since contributors reported each node running only 40-45 tasks, whereas the maximum number of containers allowed on the smallest node (512 MB of RAM) is around 60-70 containers.
A note for those who think cost and complexity put building a Ceph cluster out of reach: the picture below shows my home cluster (which I use quite heavily). The cluster comprises four ARM-based nodes (Odroid-XU4), each with a 2 TB portable USB 3.0 hard disk, a 16 GB eMMC flash disk and a Gigabit Ethernet port.
Containers need close supervision, with different metrics than traditional VMs. To monitor Docker hosts, I use a stack of InfluxDB (a time-series database), Grafana (the data visualizer) and Telegraf (to ship metrics from the hosts).
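As a rough sketch of how that pipeline fits together, a minimal `telegraf.conf` might look like the following (the URL, database name and interval are illustrative placeholders, not the author's actual settings):

```toml
# Ship host and Docker metrics to InfluxDB every 10s (illustrative values).
[agent]
  interval = "10s"

# Output: the InfluxDB instance that Grafana queries (URL/database are placeholders).
[[outputs.influxdb]]
  urls = ["http://influxdb.example.com:8086"]
  database = "telegraf"

# Inputs: basic host metrics plus per-container stats from the Docker daemon.
[[inputs.cpu]]
[[inputs.mem]]
[[inputs.docker]]
  endpoint = "unix:///var/run/docker.sock"
```

Grafana then points at the same InfluxDB database as its data source for the dashboards.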
In reality, the majority of the work a data scientist does day-to-day is NOT training neural nets to play board games. A data scientist spends 90% of his/her time cleaning, organizing, gathering and parsing data, and extracting fields/patterns from it (see here for evidence supporting that). Go has already proven extremely useful and efficient for these tasks! For the remaining 10% of data science work, which includes training algorithms and so on, Go already has a number of options. It's just that you might have to spend a little time finding the right things or implementing something here or there (see below for more resources as well), which is worth it for the huge gains in the other 90% of my tasks.
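To make that concrete, here is a small, self-contained Go sketch of the unglamorous 90%: parsing a CSV and cleaning it up. The sample data and the `cleanRecords` helper are illustrative, not from the original article.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// cleanRecords parses CSV input, trims whitespace from every field and drops
// rows with missing values -- the kind of mundane cleaning that dominates
// day-to-day data work.
func cleanRecords(raw string) ([][]string, error) {
	r := csv.NewReader(strings.NewReader(raw))
	r.TrimLeadingSpace = true
	rows, err := r.ReadAll()
	if err != nil {
		return nil, err
	}
	var out [][]string
	for _, row := range rows {
		keep := true
		for i, f := range row {
			row[i] = strings.TrimSpace(f)
			if row[i] == "" {
				keep = false
			}
		}
		if keep {
			out = append(out, row)
		}
	}
	return out, nil
}

func main() {
	data := "name, city\nalice, london\nbob, \n"
	rows, err := cleanRecords(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(rows) // [[name city] [alice london]] -- bob's row is dropped
}
```

The standard library (`encoding/csv`, `strings`, `encoding/json`, etc.) covers most of this parsing and extraction work without any third-party dependencies.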
“Most software documentation is all telling and not showing,” said Jay Hanlon, vice president of community for Stack Overflow. “We’re asking the community to upload examples of how to make the software work, to show learners how to make the software work for them.”
IT content is free courtesy of the Web. Now, it’s getting easier. But don’t believe everything on the Internet!
Happy Sysadmin Day
- Chapter 1: An Introduction to Monitoring
- Chapter 2: Monitoring, Metrics and Measurement
- Chapter 3: Events and metrics with Riemann
- Chapter 4: Storing and graphing metrics, including Graphite and Grafana
- Chapter 5: Host-based monitoring with collectd
- Chapter 6: Monitoring hosts and services
- Chapter 7: Containers – another kind of host
- Chapter 8: Logs and Logging, covering structured logging and the ELK stack
- Chapter 9: Building monitored applications
- Chapter 10: Alerting and Alert Management
- Chapters 11-13: Monitoring an application and stack
- Appendix A: An introduction to Clojure
What was the goal in publishing this book?
I wrote the book as a framework, as a potential approach. I chose some technologies that I like, but I also recommend a whole bunch of other technologies and so I provide the pros and cons of those alternatives.
Hopefully, it will provoke ideas and make you think about how you monitor your systems. At the very least, it should get people to think about how they could be doing monitoring differently. My intention was to present a roadmap toward leveling up your monitoring rather than a technology guide on how to implement specific tools.
Other high-profile examples of SQL injection include an instance when NASA sites were hacked in 2009, yielding site administrator info; when Heartland Payment Systems was breached in 2008, resulting in the exposure of 134 million credit cards; and earlier this year, when the high-profile Ashley Madison leak occurred, many experts’ first thought was “SQL injection” (though that was later stated not to be the case). In 2012, Neira Jones, the head of payment security at Barclaycard, cited SQL injection as responsible for 97 percent of all data breaches. And the Open Web Application Security Project called SQL injection the most prevalent attack of 2013.
Note that you can also build your own proxy-based solution to detect SQL injection (read here).
After introducing Aerospike for our first use case, we are now using it for various other use cases as well. Apart from the LDT feature, we now use its lists, maps and, of course, plain key-value operations.
Due to its master-master replication model, we do not have to worry about rebalancing, failover or recovery! This has definitely pleased our DevOps team. 😉
Aerospike has been performing well, so far. We are still exploring it, but it has definitely added a new dimension to our architecture!