This topic is resolved

This topic contains 2 replies, has 3 voices, and was last updated by  Swapnil Kambli 4 months, 2 weeks ago.

  • Author
    Posts
  • #1161326
     Bargavi 
    Participant
    Member Points: 125
    Rank: Level 1

    I would like to launch multiple Amazon EC2 spot instances (fleet?) using a custom AMI (docker?) for performing a deep-learning training task. I would like all the instances to share a common set of files for the purposes of training the model.

    The idea here is not to lose training history: keep a backup in EBS (network drive?) so that progress survives when AWS terminates the spot instance due to the price limit or demand. The task state can be updated in a file and then resumed when instances are available.

    Is it possible to launch all instances and let them work cooperatively to complete the training task? What kind of setup could accomplish this?

  • #1161342
     Michael Pietroforte 
    Keymaster
    Post count: 1766
    Member Points: 21,663
    Author of the year 2018
    Rank: Level 1

    It depends on the size of the project. With AWS CloudFormation you can coordinate the provisioning of multiple AWS resources.

    I am curious. What deep learning task is this? Which software are you using?
  • #1161352
     Swapnil Kambli 
    Moderator
    Post count: 48
    Member Points: 4,750
    Rank: Level 1

    Hi Bargavi,

    Spot Fleet would do the trick: it launches the instances and maintains the number of machines you want running.
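As a rough illustration of what such a request might look like from Python, here is a minimal sketch using boto3's `request_spot_fleet` call. The helper names `build_spot_fleet_config` and `submit`, and all argument values (AMI ID, instance types, role ARN, region), are hypothetical placeholders, not something from this thread; a real request would usually also need a key pair, security groups, and an instance profile.

```python
def build_spot_fleet_config(ami_id, instance_types, target_capacity, fleet_role_arn):
    """Build a SpotFleetRequestConfig dict (hypothetical helper).

    Type "maintain" asks Spot Fleet to replace interrupted instances
    so the target capacity is kept.
    """
    return {
        "IamFleetRole": fleet_role_arn,
        "TargetCapacity": target_capacity,
        "Type": "maintain",
        "AllocationStrategy": "lowestPrice",
        # One launch specification per instance type, all using the custom AMI.
        "LaunchSpecifications": [
            {"ImageId": ami_id, "InstanceType": instance_type}
            for instance_type in instance_types
        ],
    }


def submit(config, region):
    """Send the request to EC2. Needs boto3 and valid AWS credentials."""
    import boto3  # imported here so building the config works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.request_spot_fleet(SpotFleetRequestConfig=config)
    return response["SpotFleetRequestId"]


# Placeholder values for illustration only.
config = build_spot_fleet_config(
    ami_id="ami-EXAMPLE",
    instance_types=["p2.xlarge", "p3.2xlarge"],
    target_capacity=2,
    fleet_role_arn="arn:aws:iam::123456789012:role/example-spot-fleet-role",
)
```

Listing more than one instance type gives the fleet more Spot pools to draw from, which may help keep the target capacity filled.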

    Here is a step-by-step tutorial for a requirement similar to yours.

    https://aws.amazon.com/blogs/machine-learning/train-deep-learning-models-on-gpus-using-amazon-ec2-spot-instances/

    Kindly reply if you need to tweak or customize the solution.
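The idea from the question, updating the task state in a file and resuming later, can be sketched roughly as follows. This is only an illustration under assumptions: the state is a small JSON file holding the next epoch number, the function names (`save_state`, `load_state`, `train`) are made up for this example, and the file path is assumed to live on storage that outlives the instance. A real training job would also save model weights with its framework's own checkpoint mechanism.

```python
import json
import os
import tempfile


def save_state(path, state):
    """Write the state atomically so an interruption cannot leave a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())
        # os.replace swaps the new file in as a single step.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_state(path):
    """Return the saved state, or a fresh one if no checkpoint exists yet."""
    if not os.path.exists(path):
        return {"epoch": 0}
    with open(path) as f:
        return json.load(f)


def train(path, total_epochs, run_epoch):
    """Run the remaining epochs, recording progress after each one."""
    state = load_state(path)
    for epoch in range(state["epoch"], total_epochs):
        run_epoch(epoch)  # hypothetical callback that does the actual training
        save_state(path, {"epoch": epoch + 1})
    return load_state(path)
```

If the instance is terminated partway through, a replacement instance that calls `train` with the same path picks up at the first unfinished epoch.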

© 4sysops 2006 - 2019