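Site Reliability Engineer (Systems Engineer)
Acxiom Global Information Services (Nantong) Co., Ltd. (安客诚全球信息服务(南通)有限公司)
- Company size: 150-500 employees
- Company type: Foreign-owned (Europe/US)
- Industry: Computer software; computer services (systems, data services, maintenance)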
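Job Information
- Date posted: 2014-04-16
- Location: Nantong
- Openings: 3
- Work experience: More than five years
- Education: Bachelor's degree
- Language: Proficient English
- Monthly salary: Negotiable
- Job category: Technical Support/Maintenance Manager; IT Manager/Supervisor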
Job Description
The global leader in interactive marketing services, Acxiom connects clients with their customers through deep consumer insight that enables effective and profitable marketing initiatives and business decisions. Our consultative approach spans multiple industries and incorporates decades of experience in consumer data and analytics, information technology, data integration, and consulting solutions for effective marketing across digital, Internet, email, mobile and direct mail channels.
Role Overview:
The Site Reliability Engineer operates Acxiom's complex, high-traffic, business-critical internet site communications and/or network-based (cloud) product systems. The engineer is responsible for system performance, capacity planning, and writing scripts and programs for product installation. The engineer creates the Run Book for front-line support services and works with the development teams to enhance system operability (defect/defend), conducts system validation tests after application installs, and drives automation enhancements for deployment efficiency. The engineer also provides problem resolution and triage, not only restoring service but also feeding operational enhancements back to the Product organization.
Key Responsibilities:
Site Reliability Engineers (SREs) are tasked with keeping Acxiom systems up and running 24 hours per day, 365 days per year to support billions of transactions. SREs develop automated tools to provision servers, deploy software packages, monitor system health, and detect and repair problems in applications and servers. We also track performance issues and look at long-term trends to correct issues and find ways to make our systems run even faster and more efficiently.
The SRE role requires strong systems administration skills, a solid engineering background, and hands-on experience in high-volume, high-transaction environments. The SRE must be comfortable using operations tools to get the job done and must also enjoy researching, developing, and testing new ideas and tools to improve site performance, maintenance, and reliability.
It is imperative that the SRE work well in a fast-paced, collaborative environment, apply critical thinking, and bring strong problem-solving skills to complex production scenarios to ensure high availability. Finally, the SRE must be customer focused, with a passion for service and a drive for results.
Work closely with other engineers to guide design and reliability discussions to maintain fault-tolerant systems. Use various internal and external monitoring tools and systems to track production uptime and performance.
- Participate in capacity planning/analysis and performance analysis activities
- Key contributor to enabling our continuous delivery model
- Troubleshoot issues across the entire stack, from network, firewall, load balancers, and authentication to application services
- Manage our hybrid cloud environments
- Automate procedures
Key Skills (including any technical skills):
Essential:
- Strong experience with Linux/Unix
- AWS, Rackspace and/or other public/private cloud experience
- Practical knowledge of shell scripting and at least one of the following scripting languages: Python, Ruby, Perl
- Capable of prioritizing efforts with minimum supervision
- Track record of practical problem solving and excellent communication and organization skills
- Tack-sharp analytical abilities, coupled with a strong sense of ownership, urgency, and drive
- Ability to handle periodic on-call duty, preferably with prior experience supporting a large-scale web service
- Automation experience with system provisioning and application deployment
What will set you apart:
- Experience with end-to-end solutions, including a thorough understanding of the TCP/IP stack and general networking, and of network and OS security
- Passion for open source communities and philosophies; familiarity with LDAP, DNS (BIND), and YUM
- Software engineering background with experience in Java
- Puppet and/or Chef
Desirable:
Proficient with the installation practices applicable to the OS in use
Proficient understanding of the tools used to monitor system health
Able to prioritize deliverables appropriately
Multitasks to ensure deliverables are completed in a timely fashion
Advanced, proven knowledge of Acxiom products
Competencies & Experience:
May be required to provide on-call service coverage with other department employees.
Fully functional and self-directed
Provides formal mentorship
Owner of high complexity assignments
Low complexity assignments - provide oversight/review
Regularly lead self and others and/or established as Product SME and/or established as specialist
Passion
Accountability
Team-working
Creativity
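Company Profile
We uniquely combine trust, experience, and scale to power data-driven marketing. With globally renowned marketing databases and data technology, we provide marketing insight to 47 of the Fortune 100 companies. Today we process more than one trillion pieces of data per week and focus on innovative technology that lets clients extend this insight across their media investments and marketing partner ecosystem (advanced data analytics, digital media platforms, and multichannel marketing integration).
Contact Information
- Company address (work location): No. 120-1 Zilang Road, Chongchuan District, Nantong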